Why does sapply return a matrix that I need to transpose, and why won't the transposed matrix attach to a data frame?

Posted 2024-10-01 20:30:03

I would appreciate insight into why this happens and how I might do this more elegantly.

When I use sapply, I would like it to return a 3x2 matrix, but it returns a 2x3 matrix. Why is this? And why is it difficult to attach this to another data frame?

a <- data.frame(id=c('a','b','c'), var1 = c(1,2,3), var2 = c(3,2,1))
out <- sapply(a$id, function(x) out = a[x, c('var1', 'var2')])
# out is 2x3, but I would like it to be 3x2
# I then want to append t(out) (out as a 3x2 matrix) to b, a 3x1 data frame
b <- data.frame(var3=c(0,0,0))

When I try to attach these,

b[,c('col2','col3')] <- t(out)

The warning that I get is:

Warning message:
In `[<-.data.frame`(`*tmp*`, , c("col2", "col3"), value = list(1,  :
  provided 6 variables to replace 2 variables

Although the following appears to give the desired result:

rownames(out) <- c('col1', 'col2')
b <- cbind(b, t(out))

I cannot operate on the variables:

b$var1/b$var2

returns

Error in b$var1/b$var2 : non-numeric argument to binary operator

Thanks!

Comments (3)

娇柔作态 2024-10-08 20:30:03

To expand on DWin's answer: it would help to look at the structure of your out object. It explains why b$var1/b$var2 doesn't do what you expect.

> out <- sapply(a$id, function(x) out = a[x, c('var1', 'var2')])
> str(out)  # this is a list with a dim attribute, not a data.frame or a numeric matrix...
List of 6
 $ : num 1
 $ : num 3
 $ : num 2
 $ : num 2
 $ : num 3
 $ : num 1
 - attr(*, "dim")= int [1:2] 2 3
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "var1" "var2"
  ..$ : NULL
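The structure above can be checked directly. The following is a minimal sketch, not part of the original answer: it indexes rows by position with seq_len(nrow(a)) (an assumption made here so the result does not depend on whether a$id is a factor or a character vector) and shows that the sapply result is a list carrying a dim attribute. Flattening the cells with unlist() is one way to get a plain numeric matrix that arithmetic works on.

```r
a <- data.frame(id = c("a", "b", "c"), var1 = c(1, 2, 3), var2 = c(3, 2, 1))

# Index rows by position (an assumption for this sketch) rather than by a$id.
out <- sapply(seq_len(nrow(a)), function(i) a[i, c("var1", "var2")])

is.list(out)    # every cell is a separate list element
is.matrix(out)  # true only because the list carries a dim attribute
dim(out)        # one row per selected column, one column per call

# Flatten the cells into a numeric vector and rebuild a real numeric matrix.
num <- matrix(unlist(out), nrow = nrow(out), dimnames = dimnames(out))

# Arithmetic now works on the transposed numeric matrix.
ratio <- t(num)[, "var1"] / t(num)[, "var2"]
```

Columns built from list cells are what trigger the "non-numeric argument to binary operator" error in the question; the numeric rebuild avoids it.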

The apply family of functions is designed to work on vectors and arrays, so you need to take care when using them with data.frames (which are usually lists of vectors). You can use the fact that data.frames are lists to your advantage with lapply.

> out <- lapply(a$id, function(x) a[x, c('var1', 'var2')])  # list of data.frames
> out <- do.call(rbind, out) # data.frame
> b <- cbind(b,out)
> str(b)
'data.frame':   3 obs. of  3 variables:
 $ var3: num  0 0 0
 $ var1: num  1 2 3
 $ var2: num  3 2 1
> b$var1/b$var2
[1] 0.3333333 1.0000000 3.0000000
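
As a possible alternative to the lapply/do.call approach, here is a sketch using vapply, which is not mentioned in the original answer. It assumes rows are indexed by position and that each call should return a numeric vector of length 2; declaring that with numeric(2) makes vapply build a true numeric matrix instead of a list with dimensions.

```r
a <- data.frame(id = c("a", "b", "c"), var1 = c(1, 2, 3), var2 = c(3, 2, 1))
b <- data.frame(var3 = c(0, 0, 0))

# unlist() turns each one-row data.frame into a named numeric vector;
# numeric(2) tells vapply what every call must return.
out <- vapply(seq_len(nrow(a)),
              function(i) unlist(a[i, c("var1", "var2")]),
              numeric(2))

# out is 2 x 3, so transpose it before binding it to the 3-row data frame.
b <- cbind(b, t(out))
ratio <- b$var1 / b$var2

# Because the function only selects rows in order, plain column selection
# yields the same columns without any apply-style loop.
direct <- cbind(data.frame(var3 = c(0, 0, 0)), a[c("var1", "var2")])
```

When the per-row function does more than select columns, the vapply form is the one that generalizes; for this exact example the direct column selection is the simplest option.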
酷炫老祖宗 2024-10-08 20:30:03

First a bit of R notation. If you look at the code for sapply, you will find the answer to your question. The sapply function checks whether the list elements all have the same length, and if so, it first unlist()s them and then passes the result as the data argument to array(). Since array() (like matrix()) by default arranges its values in column-major order, that is what you get: each result becomes a column, so the lists get turned on their side. If you don't like it, you can define a new function tsapply that returns the transposed value:

> tsapply <- function(...) t(sapply(...))
> out <- tsapply(a$id, function(x) out = a[x, c('var1', 'var2')])
> out
     var1 var2
[1,] 1    3   
[2,] 2    2   
[3,] 3    1 

... a 3 x 2 matrix.
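
The column-major behaviour described above can be reproduced by hand. This is a simplified sketch of the idea, not the actual source of sapply: the list pieces is a hypothetical stand-in for the per-element results, and the array() call mimics the simplification step under the assumption that every result has length 2.

```r
# Hypothetical per-element results, each a named vector of length 2.
pieces <- list(c(var1 = 1, var2 = 3),
               c(var1 = 2, var2 = 2),
               c(var1 = 3, var2 = 1))

# array() fills column by column, so each result becomes one column.
filled <- array(unlist(pieces),
                dim = c(length(pieces[[1]]), length(pieces)),
                dimnames = list(names(pieces[[1]]), NULL))

# sapply arrives at the same 2 x 3 layout when it simplifies the list.
same <- sapply(pieces, identity)

# Transposing gives one row per result, which is what tsapply returns.
rows <- t(filled)
```

Comparing filled and same with all.equal() is a quick way to confirm that the manual fill matches what sapply produces.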

清风挽心 2024-10-08 20:30:03

Have a look at ddply from the plyr package:

a <- data.frame(id=c('a','b','c'), var1 = c(1,2,3), var2 = c(3,2,1))

library(plyr)
ddply(a, "id", function(x){
    out <- cbind(O1 = rnorm(nrow(x), x$var1), O2 = runif(nrow(x)))
    out
})