How to create a matrix from a list of vectors in R?

Posted on 2024-08-03 18:04:35

Goal: from a list of vectors of equal length, create a matrix where each vector becomes a row.

Example:

> a <- list()
> for (i in 1:10) a[[i]] <- c(i,1:5)
> a
[[1]]
[1] 1 1 2 3 4 5

[[2]]
[1] 2 1 2 3 4 5

[[3]]
[1] 3 1 2 3 4 5

[[4]]
[1] 4 1 2 3 4 5

[[5]]
[1] 5 1 2 3 4 5

[[6]]
[1] 6 1 2 3 4 5

[[7]]
[1] 7 1 2 3 4 5

[[8]]
[1] 8 1 2 3 4 5

[[9]]
[1] 9 1 2 3 4 5

[[10]]
[1] 10  1  2  3  4  5

I want:

      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5 

Comments (7)

子栖 2024-08-10 18:04:35

One option is to use do.call():

 > do.call(rbind, a)
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
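
To see why this works: do.call(rbind, a) calls rbind with each element of the list as a separate argument, so it should be equivalent to writing rbind(a[[1]], a[[2]], ...) by hand. A minimal sketch with a three-element list (the name `small` is made up for illustration):

```r
# Illustrative three-element list; `small` is a made-up name.
small <- list(c(1L, 1:5), c(2L, 1:5), c(3L, 1:5))

# do.call() passes each list element to rbind() as its own argument.
m1 <- do.call(rbind, small)

# The same call written out by hand.
m2 <- rbind(small[[1]], small[[2]], small[[3]])

identical(m1, m2)
dim(m1)
```

The do.call() form is the practical one because it does not need to know the length of the list in advance.
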
饮惑 2024-08-10 18:04:35

simplify2array is a base function that is fairly intuitive. However, since R's default is to fill in data by columns first, you will need to transpose the output. (sapply uses simplify2array, as documented in help(sapply).)

> t(simplify2array(a))
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
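
The transpose is needed because each vector ends up as a column, not a row. A small sketch of that, assuming a two-element list called `b` (a made-up name), which also checks the stated relationship with sapply:

```r
# Illustrative two-element list; `b` is a made-up name.
b <- list(c(1L, 1:5), c(2L, 1:5))

cols <- simplify2array(b)  # 6 x 2: each vector fills one column
rows <- t(cols)            # 2 x 6: each vector is now one row

dim(cols)
dim(rows)

# sapply() with an identity function simplifies the same way.
identical(cols, sapply(b, identity))
```
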
肤浅与狂妄 2024-08-10 18:04:35

The built-in matrix function has a handy byrow option for filling in data row by row. Combining that with unlist on your source list gives you a matrix. We also need to specify the number of rows so that matrix knows how to break up the unlisted data. That is:

> matrix(unlist(a), byrow=TRUE, nrow=length(a) )
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
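
Note that this approach relies on every vector having the same length: unlist() discards the boundaries between elements, so a ragged list would be misaligned or recycled rather than padded. One way to guard against that is sketched below; `list_to_matrix` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper (not part of base R): refuse ragged input instead of
# letting matrix() misalign or recycle the flattened values.
list_to_matrix <- function(x) {
  stopifnot(is.list(x), length(x) > 0, length(unique(lengths(x))) == 1L)
  matrix(unlist(x), nrow = length(x), byrow = TRUE)
}

# Same data as in the question, built with lapply().
a <- lapply(1:10, function(i) c(i, 1:5))
m <- list_to_matrix(a)
dim(m)

# A ragged list is rejected with an error rather than silently reshaped.
ragged <- list(1:2, 1:3)
res <- try(list_to_matrix(ragged), silent = TRUE)
inherits(res, "try-error")
```
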
伴梦长久 2024-08-10 18:04:35

Not straightforward, but it works:

> t(sapply(a, unlist))
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
清晰传感 2024-08-10 18:04:35
t(sapply(a, '[', 1:max(sapply(a, length))))

where 'a' is a list.
This also works when the vectors have unequal lengths; shorter rows are padded with NA.
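
The reason this handles unequal lengths is that indexing a vector past its end returns NA, so every element is stretched to the length of the longest one before sapply combines them. A minimal sketch with a made-up ragged list:

```r
# Illustrative ragged list; `ragged` and `n` are made-up names.
ragged <- list(1:2, 1:4, 1:3)
n <- max(lengths(ragged))

# Indexing beyond the end of a vector yields NA for the missing positions.
ragged[[1]][seq_len(n)]

# Apply `[` to every element, then transpose so each vector is a row.
m <- t(sapply(ragged, `[`, seq_len(n)))
m
```
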

不再让梦枯萎 2024-08-10 18:04:35
> library(plyr)
> as.matrix(ldply(a))
      V1 V2 V3 V4 V5 V6
 [1,]  1  1  2  3  4  5
 [2,]  2  1  2  3  4  5
 [3,]  3  1  2  3  4  5
 [4,]  4  1  2  3  4  5
 [5,]  5  1  2  3  4  5
 [6,]  6  1  2  3  4  5
 [7,]  7  1  2  3  4  5
 [8,]  8  1  2  3  4  5
 [9,]  9  1  2  3  4  5
[10,] 10  1  2  3  4  5
静水深流 2024-08-10 18:04:35

data.table::transpose(a) can be a useful tool here if your list elements have unequal sizes or if you actually want a data.frame instead.

It efficiently turns a length-n list of length-up-to-p vectors into a length-p list of length-n vectors, padding the missing elements with a value of your choice.

# For list of vectors of unequal size if you want to pad instead of recycle
a <- sapply(1:6, function(i) c(i, seq_len(i)))
a
#> [[1]]
#> [1] 1 1
#> 
#> [[2]]
#> [1] 2 1 2
#> 
#> [[3]]
#> [1] 3 1 2 3
#> 
#> [[4]]
#> [1] 4 1 2 3 4
#> 
#> [[5]]
#> [1] 5 1 2 3 4 5
#> 
#> [[6]]
#> [1] 6 1 2 3 4 5 6

matrix(unlist(data.table::transpose(a)), nrow=length(a))
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,]    1    1   NA   NA   NA   NA   NA
#> [2,]    2    1    2   NA   NA   NA   NA
#> [3,]    3    1    2    3   NA   NA   NA
#> [4,]    4    1    2    3    4   NA   NA
#> [5,]    5    1    2    3    4    5   NA
#> [6,]    6    1    2    3    4    5    6
#
## neat if you want a data.frame instead
data.table::setDF(data.table::as.data.table(data.table::transpose(a)))[]
#>   V1 V2 V3 V4 V5 V6 V7
#> 1  1  1 NA NA NA NA NA
#> 2  2  1  2 NA NA NA NA
#> 3  3  1  2  3 NA NA NA
#> 4  4  1  2  3  4 NA NA
#> 5  5  1  2  3  4  5 NA
#> 6  6  1  2  3  4  5  6

It is almost as fast as the matrix(unlist( ), byrow=TRUE) solution and much faster than the t(sapply( )) approach, which also works for unequal lengths.

a <- sapply(1:6, function(i) c(i, seq_len(i)))
a
bench::mark(
  matrix(unlist(data.table::transpose(a)), nrow=length(a)),
  t(sapply(a, '[', 1:max(sapply(a, length))))
)
#> # A tibble: 2 × 6
#>   expression                                                      min   median
#>   <bch:expr>                                                 <bch:tm> <bch:tm>
#> 1 matrix(unlist(data.table::transpose(a)), nrow = length(a))   6.87µs   8.68µs
#> 2 t(sapply(a, "[", 1:max(sapply(a, length))))                 33.29µs  42.14µs
#> # ℹ 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>



# small list, equal sizes
a <- sapply(1:6, function(i) c(i, seq_len(5)), simplify = FALSE)
a
#> [[1]]
#> [1] 1 1 2 3 4 5
#> 
#> [[2]]
#> [1] 2 1 2 3 4 5
#> 
#> [[3]]
#> [1] 3 1 2 3 4 5
#> 
#> [[4]]
#> [1] 4 1 2 3 4 5
#> 
#> [[5]]
#> [1] 5 1 2 3 4 5
#> 
#> [[6]]
#> [1] 6 1 2 3 4 5

bench::mark(
  matrix(unlist(data.table::transpose(a)), nrow=length(a)),
  t(sapply(a, '[', 1:max(sapply(a, length)))),
  do.call(rbind, a),
  matrix(unlist(a), byrow=TRUE, nrow=length(a) )
)
#> # A tibble: 4 × 6
#>   expression                                                      min   median
#>   <bch:expr>                                                 <bch:tm> <bch:tm>
#> 1 matrix(unlist(data.table::transpose(a)), nrow = length(a))   7.03µs   9.06µs
#> 2 t(sapply(a, "[", 1:max(sapply(a, length))))                 32.99µs  36.18µs
#> 3 do.call(rbind, a)                                            2.92µs   3.47µs
#> 4 matrix(unlist(a), byrow = TRUE, nrow = length(a))            2.77µs   3.07µs
#> # ℹ 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>


# large list, equal sizes
a <- sapply(seq_len(100000), function(i) c(i, seq_len(5)), simplify = FALSE)

bench::mark(
  matrix(unlist(data.table::transpose(a)), nrow=length(a)),
  t(sapply(a, '[', 1:max(sapply(a, length)))),
  do.call(rbind, a),
  matrix(unlist(a), byrow=TRUE, nrow=length(a) )
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 4 × 6
#>   expression                                                      min   median
#>   <bch:expr>                                                 <bch:tm> <bch:tm>
#> 1 matrix(unlist(data.table::transpose(a)), nrow = length(a))  11.62ms  12.54ms
#> 2 t(sapply(a, "[", 1:max(sapply(a, length))))                 94.56ms 101.09ms
#> 3 do.call(rbind, a)                                           59.02ms  70.49ms
#> 4 matrix(unlist(a), byrow = TRUE, nrow = length(a))            7.02ms   7.82ms
#> # ℹ 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>
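
bench::mark() checks by default that all expressions return equal results, so the timings above compare like with like. For the equal-length case, the same kind of check can be sketched in base R alone (the `results` list is a made-up name, and the data.table variant is left out so that the snippet has no package dependencies):

```r
# Same data as in the question.
a <- lapply(1:10, function(i) c(i, 1:5))

# Base R approaches from the answers above.
results <- list(
  do.call(rbind, a),
  matrix(unlist(a), byrow = TRUE, nrow = length(a)),
  t(simplify2array(a)),
  t(sapply(a, `[`, seq_len(max(lengths(a)))))
)

# Compare every result against the first one.
all(vapply(results[-1], identical, logical(1), results[[1]]))
```
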