String r r-faq

Concatenating a vector of strings/characters

Posted 2024-08-18 14:54:44, 220 words, 13 views, 0 comments

If I have a vector of type character, how can I concatenate the values into a single string? Here's how I would do it with paste():

sdata = c('a', 'b', 'c')
paste(sdata[1], sdata[2], sdata[3], sep ='')

yielding "abc".

But of course, that only works if I know the length of sdata ahead of time.

伤感在游骋 2024-08-25 14:54:45

You can use the stri_paste function from the stringi package with its collapse parameter:

library(stringi)
stri_paste(letters, collapse='')
## [1] "abcdefghijklmnopqrstuvwxyz"

And some benchmarks:

library(microbenchmark)
test <- stri_rand_lipsum(100)
microbenchmark(stri_paste(test, collapse=''), paste(test,collapse=''), do.call(paste, c(as.list(test), sep="")))
Unit: microseconds
                                      expr     min       lq     mean   median       uq     max neval
           stri_paste(test, collapse = "") 137.477 139.6040 155.8157 148.5810 163.5375 226.171   100
                paste(test, collapse = "") 404.139 406.4100 446.0270 432.3250 442.9825 723.793   100
do.call(paste, c(as.list(test), sep = "")) 216.937 226.0265 251.6779 237.3945 264.8935 405.989   100

千笙结 2024-08-25 14:54:45

tidyverse

The stringr package has a few fast ways to accomplish this.

str_flatten

By default it collapses your character vector with no separator, but it also has a collapse argument:

library(stringr)

str_flatten(sdata)
# [1] "abc"

It also has an optional last argument to use in place of the final separator.
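
As a minimal sketch of how collapse and last interact (assuming stringr 1.5.0 or later, where last is available; the vector x is made up for illustration and is not from the original answer):

```r
library(stringr)

# Illustrative input, not from the original post
x <- c("a", "b", "c")

# No separator at all (the default collapse is "")
joined_plain <- str_flatten(x)

# collapse is placed between every pair of elements
joined_comma <- str_flatten(x, collapse = ", ")

# last replaces only the final separator
joined_last <- str_flatten(x, collapse = ", ", last = ", and ")

print(joined_plain)  # "abc"
print(joined_comma)  # "a, b, c"
print(joined_last)   # "a, b, and c"
```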

str_c

Similar to paste, but you need to specify the collapse argument to accomplish this:

str_c(sdata, collapse = "")
# [1] "abc"

str_flatten_comma

New as of stringr 1.5.0, for when you want a comma-delimited collapse. Here the last argument recognizes the Oxford comma:

str_flatten_comma(sdata)
# [1] "a, b, c"

str_flatten_comma(sdata[1:2], last = " and ")
# [1] "a and b"

stringfish (faster)

For most cases stringi and stringr will provide enough speed, but if you need something faster, consider the stringfish package:

library(stringfish)

sf_collapse(sdata, collapse = '')
# [1] "abc"

Base R

For completeness, you could use paste0, though there is no obvious advantage here over paste:

paste0(sdata, collapse = "")
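
To illustrate why paste0 offers no advantage here, the following sketch (base R only; the vectors first and second are made up for illustration) shows that sep only matters when several vectors are pasted together, while collapse is what joins the elements of the result:

```r
sdata <- c("a", "b", "c")

# With a single vector there is nothing for sep to separate,
# so paste and paste0 give the same collapsed string.
same_result <- identical(paste0(sdata, collapse = ""),
                         paste(sdata, collapse = ""))
print(same_result)  # TRUE

# With two vectors, sep joins element-wise and collapse joins the pairs.
first  <- c("a", "b")
second <- c("x", "y")
with_paste  <- paste(first, second, collapse = "+")   # default sep = " "
with_paste0 <- paste0(first, second, collapse = "+")  # sep is always ""

print(with_paste)   # "a x+b y"
print(with_paste0)  # "ax+by"
```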

Benchmark

Across vector sizes of 10K, 100K, 1M and 10M, stringfish yields consistently faster results:

   expression  string_length      min   median `itr/sec` `gc/sec` n_itr  n_gc
   <bch:expr>          <dbl> <bch:tm> <bch:tm>     <dbl>    <dbl> <int> <dbl>
 1 sf_collapse         10000    149µs  163.8µs   6020.       0     3009     0
 2 str_flatten         10000  166.8µs  174.2µs   5527.       0     2762     0
 3 stri_paste          10000  163.9µs    176µs   5517.       0     2757     0
 4 str_c               10000    186µs  200.3µs   4954.       0     2476     0
 5 paste               10000  606.8µs  677.4µs   1472.       2.06   715     1
 6 paste0              10000  606.3µs  681.9µs   1449.       0      725     0
 7 sf_collapse        100000   1.48ms   1.55ms    643.       0      322     0
 8 stri_paste         100000   1.81ms   1.96ms    490.       0      245     0
 9 str_flatten        100000   1.81ms   2.04ms    486.       0      244     0
10 str_c              100000   1.84ms   2.03ms    480.       0      241     0
11 paste0             100000   6.24ms   6.73ms    147.       2.10    70     1
12 paste              100000   6.37ms   6.98ms    142.       0       72     0
13 sf_collapse       1000000  16.02ms  16.88ms     59.4      0       30     0
14 str_flatten       1000000  19.45ms  20.02ms     49.7      0       25     0
15 stri_paste        1000000  19.28ms  20.07ms     49.6      0       25     0
16 str_c             1000000   19.8ms  20.77ms     47.9      0       24     0
17 paste             1000000  64.06ms  65.41ms     15.3      0        8     0
18 paste0            1000000  64.54ms  65.75ms     15.1      0        8     0
19 sf_collapse      10000000 167.88ms 169.53ms      5.91     0        3     0
20 str_c            10000000 199.67ms 200.04ms      4.96     0        3     0
21 stri_paste       10000000 205.17ms 210.69ms      4.76     0        3     0
22 str_flatten      10000000 216.98ms 217.04ms      4.60     0        3     0
23 paste0           10000000 690.85ms 690.85ms      1.45     0        1     0
24 paste            10000000 767.12ms 767.12ms      1.30     1.30     1     1

Benchmark Code

library(bench)
library(stringr)
library(stringi)
library(stringfish)

set.seed(4)
results <- press(
  string_length = c(1E4, 1E5, 1E6, 1E7),
  {
    x <- sample(letters, string_length, replace = TRUE)
    mark(
      stri_paste = stri_paste(x, collapse=''), 
      paste = paste(x,collapse=''),
      str_flatten = str_flatten(x),
      str_c = str_c(x, collapse = ""),
      paste0 = paste0(x, collapse = ""),
      sf_collapse = sf_collapse(x, collapse = ""),
      memory = FALSE)
  }
)


sort_by(results, ~ list(string_length, - `itr/sec`)) |>
  subset(select = c(1:5, 7:9)) |>
  print(n = 24)

歌枕肩 2024-08-25 14:54:45

Matt Turner's answer is definitely the right answer. However, in the spirit of Ken Williams' answer, you could also do:

capture.output(cat(sdata, sep=""))
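
One caveat worth sketching (base R only; the vector with_newline is a made-up example): capture.output returns one element per printed line, so this trick no longer yields a single string once an element contains a newline, whereas paste with collapse does.

```r
sdata <- c("a", "b", "c")

# For plain input the captured output is a single string
joined <- capture.output(cat(sdata, sep = ""))
print(joined)  # "abc"

# Illustrative input containing a newline
with_newline <- c("a\n", "b")

# capture.output splits on lines, so two elements come back
captured <- capture.output(cat(with_newline, sep = ""))
print(length(captured))  # 2

# paste keeps the newline inside one string
pasted <- paste(with_newline, collapse = "")
print(length(pasted))    # 1
```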

心舞飞扬 2024-08-25 14:54:45

For sdata:

gsub(", ", "", toString(sdata))

For a vector of integers:

gsub(", ", "", toString(c(1:10)))
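
A short sketch of a limitation of this approach (base R only; the vector tricky is made up for illustration): because toString joins with ", " and gsub then removes every ", ", any element that already contains that sequence is altered as well.

```r
# Illustrative input where one element already contains ", "
tricky <- c("a, b", "c")

# toString gives "a, b, c"; gsub then strips both separators
via_gsub <- gsub(", ", "", toString(tricky))
print(via_gsub)   # "abc"

# paste with collapse leaves the element contents untouched
via_paste <- paste(tricky, collapse = "")
print(via_paste)  # "a, bc"

# For input without ", " the two approaches agree
agree <- identical(gsub(", ", "", toString(1:10)),
                   paste(1:10, collapse = ""))
print(agree)      # TRUE
```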

你没皮卡萌 2024-08-25 14:54:45

Another way would be to use the glue package:

library(glue)

glue_collapse(glue("{sdata}"))
paste(glue("{sdata}"), collapse = '')
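
As a small sketch (assuming the glue package is installed), glue_collapse can also be called on the character vector directly, and it accepts sep and last arguments for separators. The result is a glue object, so as.character is used here to get a plain string.

```r
library(glue)

sdata <- c("a", "b", "c")

# Collapse directly; the default sep is ""
collapsed <- as.character(glue_collapse(sdata))
print(collapsed)  # "abc"

# sep goes between elements, last replaces the final separator
listed <- as.character(glue_collapse(sdata, sep = ", ", last = " and "))
print(listed)     # "a, b and c"
```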

白龙吟 2024-08-25 14:54:45

Here is a little utility function that collapses a named or unnamed list of values into a single string for easier printing. By default it also labels the output with the expression that was passed in. It's from my list examples in R page.

Generate some named or unnamed lists:

# Define Lists
ls_num <- list(1,2,3)
ls_str <- list('1','2','3')
ls_num_str <- list(1,2,'3')

# Named Lists
ar_st_names <- c('e1','e2','e3')
ls_num_str_named <- ls_num_str
names(ls_num_str_named) <- ar_st_names

# Add Element to Named List
ls_num_str_named$e4 <- 'this is added'

Here is a function that converts a named or unnamed list to a string:

ffi_lst2str <- function(ls_list, st_desc, bl_print=TRUE) {

  # string desc
  if(missing(st_desc)){
    st_desc <- deparse(substitute(ls_list))
  }

  # create string
  st_string_from_list = paste0(paste0(st_desc, ':'), 
                               paste(names(ls_list), ls_list, sep="=", collapse=";" ))

  if (bl_print){
    print(st_string_from_list)
  }
}

Testing the function with the lists created prior:

> ffi_lst2str(ls_num)
[1] "ls_num:=1;=2;=3"
> ffi_lst2str(ls_str)
[1] "ls_str:=1;=2;=3"
> ffi_lst2str(ls_num_str)
[1] "ls_num_str:=1;=2;=3"
> ffi_lst2str(ls_num_str_named)
[1] "ls_num_str_named:e1=1;e2=2;e3=3;e4=this is added"

Testing the function with subset of list elements:

> ffi_lst2str(ls_num_str_named[c('e2','e3','e4')])
[1] "ls_num_str_named[c(\"e2\", \"e3\", \"e4\")]:e2=2;e3=3;e4=this is added"
> ffi_lst2str(ls_num[2:3])
[1] "ls_num[2:3]:=2;=3"
> ffi_lst2str(ls_str[2:3])
[1] "ls_str[2:3]:=2;=3"
> ffi_lst2str(ls_num_str[2:4])
[1] "ls_num_str[2:4]:=2;=3;=NULL"
> ffi_lst2str(ls_num_str_named[c('e2','e3','e4')])
[1] "ls_num_str_named[c(\"e2\", \"e3\", \"e4\")]:e2=2;e3=3;e4=this is added"

浅沫记忆 2024-08-25 14:54:44

Try using an empty collapse argument within the paste function:

paste(sdata, collapse = '')

Thanks to http://twitter.com/onelinetips/status/7491806343
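
Because collapse works on the whole vector, the length never has to be known in advance. A minimal sketch in base R, where collapse_all is a hypothetical helper name introduced only for this illustration:

```r
# Hypothetical helper: join all elements of x into one string
collapse_all <- function(x, sep = "") {
  paste(x, collapse = sep)
}

empty  <- collapse_all(character(0))    # zero-length input gives ""
single <- collapse_all("a")             # one element is returned as is
many   <- collapse_all(letters[1:5])    # any length works
dashed <- collapse_all(1:3, sep = "-")  # non-character input is coerced

print(empty)   # ""
print(single)  # "a"
print(many)    # "abcde"
print(dashed)  # "1-2-3"

# Note that NA is converted to the string "NA" rather than propagated
with_na <- collapse_all(c("a", NA))
print(with_na) # "aNA"
```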

七秒鱼° 2024-08-25 14:54:44

Matt's answer is definitely the right answer. However, here's an alternative solution for comic relief purposes:

do.call(paste, c(as.list(sdata), sep = ""))
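
For readers wondering why this works, here is a brief sketch in base R: do.call passes each list element as a separate argument, so the call is equivalent to writing the elements out by hand, and it gives the same result as the collapse approach.

```r
sdata <- c("a", "b", "c")

# Each element of the list becomes one argument to paste,
# and the named element sep = "" becomes the sep argument.
via_do_call <- do.call(paste, c(as.list(sdata), sep = ""))

# The call above is equivalent to spelling the arguments out
by_hand <- paste("a", "b", "c", sep = "")

# And both match the idiomatic collapse version
via_collapse <- paste(sdata, collapse = "")

print(via_do_call)  # "abc"
print(identical(via_do_call, by_hand))       # TRUE
print(identical(via_do_call, via_collapse))  # TRUE
```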
