Multiplying the rows of a data frame pairwise

Posted on 2024-12-03 03:15:53

Input file:

df1 <- data.frame(row.names=c("w","x","y","z"), 
                  A=c(0,0,0,0),
                  B=c(0,1,0,0), 
                  C=c(1,0,1,0), 
                  D=c(1,1,1,1))

  A B C D
w 0 0 1 1
x 0 1 0 1
y 0 0 1 1
z 0 0 0 1

I want to multiply row w by row x, element by element, to get the pairwise values for the w-x pair, as follows:

      A B C D
    w 0 0 1 1
X   x 0 1 0 1
--------------
   wx 0 0 0 1

I want this row-wise product for every pair (w-x, w-y, w-z, x-y, x-z, y-z), collected in a new data frame with 6 columns (the two row names followed by the multiplied values).

That is:

w x 0 0 0 1
w y 0 0 1 1
w z 0 0 0 1
x y 0 0 0 1
x z 0 0 0 1
y z 0 0 0 1

Thanks.

若无相欠,怎会相见 2024-12-10 03:15:54

If you don't want the combo names in the resulting object, then we can combine elements of @DWin's and @Owen's Answers to provide a truly vectorised approach to the problem. (You can add the combination names as row names with one extra step at the end.)

First, the data:

dat <- read.table(con <- textConnection("  A B C D
w 0 0 1 1
x 0 1 0 1
y 0 0 1 1
z 0 0 0 1
"), header=TRUE)
close(con)

Take the combn() idea from @DWin's Answer but use it on the row indices of dat:

combs <- combn(seq_len(nrow(dat)), 2)

The rows of combs now index the rows of dat that we want to multiply together:

> combs
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    2    2    3
[2,]    2    3    4    3    4    4

Now we take the idea @Owen showed, namely dat[i, ] * dat[j, ] with i and j being the first and second rows of combs respectively. We convert to a matrix with data.matrix() as this will be more efficient for large objects, but the code will work with dat as a data frame too.

mat <- data.matrix(dat)
mat[combs[1,], ] * mat[combs[2,], ]

which produces:

> mat[combs[1,], ] * mat[combs[2,], ]
  A B C D
w 0 0 0 1
w 0 0 1 1
w 0 0 0 1
x 0 0 0 1
x 0 0 0 1
y 0 0 0 1

To see how this works, note that mat[combs[k,], ] produces a matrix with various rows repeated in the order specified by the combinations:

> mat[combs[1,], ]
  A B C D
w 0 0 1 1
w 0 0 1 1
w 0 0 1 1
x 0 1 0 1
x 0 1 0 1
y 0 0 1 1
> mat[combs[2,], ]
  A B C D
x 0 1 0 1
y 0 0 1 1
z 0 0 0 1
y 0 0 1 1
z 0 0 0 1
z 0 0 0 1

To get exactly what the OP posted, we can modify the rownames using a second combn() call:

> out <- mat[combs[1,], ] * mat[combs[2,], ]
> rownames(out) <- apply(combn(rownames(dat), 2), 2, paste, collapse = "")
> out
   A B C D
wx 0 0 0 1
wy 0 0 1 1
wz 0 0 0 1
xy 0 0 0 1
xz 0 0 0 1
yz 0 0 0 1
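
For readers who want the two row names as separate columns (the six-column layout asked for in the question) rather than pasted together, the same vectorised indexing can be wrapped in a small function. This is only a sketch: `pairwise_row_products`, `row1` and `row2` are names made up for illustration, not part of any answer here.

```r
# Hypothetical helper: multiply every pair of rows and return the two
# row names as separate columns, followed by the element-wise products.
pairwise_row_products <- function(dat) {
  mat <- data.matrix(dat)                    # numeric matrix, keeps dimnames
  combs <- combn(seq_len(nrow(mat)), 2)      # 2 x choose(n, 2) matrix of row indices
  prod_mat <- mat[combs[1, ], , drop = FALSE] * mat[combs[2, ], , drop = FALSE]
  rownames(prod_mat) <- NULL                 # drop the repeated (duplicate) row names
  data.frame(row1 = rownames(mat)[combs[1, ]],
             row2 = rownames(mat)[combs[2, ]],
             prod_mat,
             stringsAsFactors = FALSE)
}

df1 <- data.frame(row.names = c("w", "x", "y", "z"),
                  A = c(0, 0, 0, 0),
                  B = c(0, 1, 0, 0),
                  C = c(1, 0, 1, 0),
                  D = c(1, 1, 1, 1))

res <- pairwise_row_products(df1)
print(res)
```

Because the names are kept in their own columns and the products stay in a numeric matrix until the final `data.frame()` call, the value columns remain numeric.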
浅唱々樱花落 2024-12-10 03:15:54
> dat <- read.table(textConnection("  A B C D
+ w 0 0 1 1
+ x 0 1 0 1
+ y 0 0 1 1
+ z 0 0 0 1
+ "), header=TRUE)
> rn <- rownames(dat)
> combos <- combn(rn,2)
> combos
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "w"  "w"  "w"  "x"  "x"  "y" 
[2,] "x"  "y"  "z"  "y"  "z"  "z" 

apply(combos,2, function(x) c(x[1], x[2], unlist(dat[x[1],]*dat[x[2],])))
  [,1] [,2] [,3] [,4] [,5] [,6]
  "w"  "w"  "w"  "x"  "x"  "y" 
  "x"  "y"  "z"  "y"  "z"  "z" 
A "0"  "0"  "0"  "0"  "0"  "0" 
B "0"  "0"  "0"  "0"  "0"  "0" 
C "0"  "1"  "0"  "0"  "0"  "0" 
D "1"  "1"  "1"  "1"  "1"  "1" 

So the final solution:

t( apply(combos,2, function(x) c(x[1], x[2], unlist(dat[x[1],]*dat[x[2],]))) )

If you convert the combos to a data frame, you can also cbind the matrix of products and keep it as type "numeric":

 cbind( as.data.frame(t(combos)), 
        t( apply(combos,2, function(x)  
                    unlist(dat[x[1],]*dat[x[2],]))) )

  V1 V2 A B C D
1  w  x 0 0 0 1
2  w  y 0 0 1 1
3  w  z 0 0 0 1
4  x  y 0 0 0 1
5  x  z 0 0 0 1
6  y  z 0 0 0 1
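
The reason the first result prints quoted values such as "0" and "1" is that `c()` coerces the numbers to character once they share a vector with the row names. A minimal sketch of that coercion, using a cut-down two-row data frame (assumed here only for illustration):

```r
# Small stand-in for the data above (two rows, two columns).
dat <- data.frame(row.names = c("w", "x"), A = c(0, 0), B = c(0, 1))

# Names and products in one vector: everything becomes character.
mixed <- c("w", "x", unlist(dat["w", ] * dat["x", ]))
print(is.character(mixed))

# Products kept apart from the names: they stay numeric.
nums <- unlist(dat["w", ] * dat["x", ])
print(is.numeric(nums))
```

Keeping the names in their own data frame and binding the numeric matrix to it afterwards, as in the `cbind()` call above, avoids the coercion.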
稍尽春風 2024-12-10 03:15:54

If you want to multiply rows, I recommend converting to a matrix:

> m = as.matrix(df1)

> m["x", ] * m["y", ]
A B C D 
0 0 0 1 

You could get the specific result you want with plyr:

library(plyr)

ldply(1:(nrow(m)-1), function(i)
    ldply((i+1):nrow(m), function(j) {
        a = row.names(m)[[i]]
        b = row.names(m)[[j]]

        do.call(data.frame,
            c(list(a=a, b=b), m[i,] * m[j,])
        )
    })
)

Sorry, part of that looks a little magical: data.frames aren't really meant to be built row by row. The lines

do.call(data.frame,
    c(list(a=a, b=b), m[i,] * m[j,])
)

pass in the 6 columns: a and b for the names, concatenated (with c) with the multiplied row, whose elements become the columns A to D.
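
If you would rather not load plyr, the same row-at-a-time idea can be sketched in base R with `combn()`, `lapply()` and `rbind()`. This swaps the nested `ldply()` calls for a different technique; it is not the code from this answer, and the column names `a` and `b` are simply carried over from it.

```r
df1 <- data.frame(row.names = c("w", "x", "y", "z"),
                  A = c(0, 0, 0, 0),
                  B = c(0, 1, 0, 0),
                  C = c(1, 0, 1, 0),
                  D = c(1, 1, 1, 1))
m <- as.matrix(df1)

# One character vector of length 2 per pair of row names.
pairs <- combn(rownames(m), 2, simplify = FALSE)

# Build a one-row data frame per pair: the two names, then the products.
rows <- lapply(pairs, function(p) {
  data.frame(a = p[1], b = p[2],
             t(m[p[1], ] * m[p[2], ]),   # t() turns the named vector into a 1-row matrix
             stringsAsFactors = FALSE)
})

result <- do.call(rbind, rows)
print(result)
```

Like the plyr version, this builds many small data frames, so for large inputs the vectorised matrix indexing shown in the other answer is likely to be faster.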

愚人国度 2024-12-10 03:15:54

A shorter way (I think), using the amazing plyr package.

Your data.frame:

df1 <- data.frame(row.names=c("w","x","y","z"), A=c(0,0,0,0), B=c(0,1,0,0), C=c(1,0,1,0), D=c(1,1,1,1))

YOUR_COMBS<-combn(rownames(df1),2)

And your result :)

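require(plyr) # (version 1.8.1; in version 1.8.2 you can take the annoying 'X1' index out)

# For each column of YOUR_COMBS (one pair of row names), return a one-row
# data frame: the pasted pair name followed by the element-wise product.
YOUR_RESULTS <- adply(YOUR_COMBS, 2, function(x) {
  data.frame(Comb = paste0(x, collapse = ""), df1[x[1], ] * df1[x[2], ])
})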