Multiply each cell of a data.frame by its weight

Posted 2024-10-18 20:27:23

What I want to do is embarrassingly simple, yet I keep failing.

I have a data.frame with "characters" and "numerics". One of the columns of the data.frame represents the weights.

I want to multiply every cell of the data frame by the corresponding weight (if the cell is numeric).

How do I do that (ideally without using a nested loop)?

Thank you in advance!

Example:

   c1   c2   w   
l1 abc  2    1
l2 dxf  3    0.5
l3 ghi  4    1.5

should become

   c1   c2   w   
l1 abc  2    1
l2 dxf  1.5  0.5
l3 ghi  6    1.5

Comments (4)

触ぅ动初心 2024-10-25 20:27:23

For a reproducible example, dd is a data frame with a mixture of variable types, with W being the weights.

dd <- data.frame(G=gl(2,2), X=rnorm(4), Y=1L:4L, Z=letters[1:4], W=0.3:3.3)
num.vars <- names(dd)[sapply(dd, is.numeric)]  #select numeric variables
num.vars <- setdiff(num.vars, "W")  # remove the weight variable
dd[num.vars] <- dd[num.vars] * dd$W  # multiply
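
As a quick check, the same steps can be applied to the data from the question. This is only a sketch: the helper name `weight_numeric` and its `weight_col` argument are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper: multiply every numeric column except the weight column
# by the weights, leaving non-numeric columns untouched.
weight_numeric <- function(df, weight_col = "w") {
  num_vars <- names(df)[sapply(df, is.numeric)]    # numeric columns
  num_vars <- setdiff(num_vars, weight_col)        # drop the weight column itself
  df[num_vars] <- df[num_vars] * df[[weight_col]]  # row i is scaled by weight i
  df
}

# Data from the question
q <- data.frame(c1 = c("abc", "dxf", "ghi"), c2 = c(2, 3, 4), w = c(1, 0.5, 1.5))
res <- weight_numeric(q)
res$c2
```

With the question's data, `res$c2` should hold 2, 1.5 and 6, while `c1` and `w` stay as they were.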

温折酒 2024-10-25 20:27:23

Vectorise!

> dat <- data.frame(c1 = c("abc","dxf","ghi"), c2 = 2:4, w = c(1,0.5,1.5))

Effectively, you want c2 * w, but we need to tell R to look inside the data frame:

> with(dat, c2 * w)
[1] 2.0 1.5 6.0

Which we can insert back into dat in a single line:

> dat <- within(dat, c3 <- c2 * w)
> dat
   c1 c2   w  c3
1 abc  2 1.0 2.0
2 dxf  3 0.5 1.5
3 ghi  4 1.5 6.0

(Replace c3 with c2 if you want to overwrite the existing c2.)
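
For instance, overwriting `c2` in place might look like the following sketch, which rebuilds the same `dat` example so that it runs on its own.

```r
dat <- data.frame(c1 = c("abc", "dxf", "ghi"), c2 = 2:4, w = c(1, 0.5, 1.5))
# Assign to c2 instead of c3, so the weighted values replace the originals
dat <- within(dat, c2 <- c2 * w)
dat$c2
```

Because `c2` already exists, no new column is added; the column simply changes from integer to double.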

If you have more than one numeric column other than the weights, a slightly different strategy is required if you want to automate it (i.e. not tell R which columns to multiply by w).

> ## dummy data
> dat2 <- data.frame(c1 = c("abc","dxf","ghi"), c2 = 2:4, w = c(1,0.5,1.5),
                     c3 = 5:7, c4 = 3:5)
> ## select the columns we want, all numerics, but not `w`
> want <- sapply(dat2, is.numeric) & names(dat2) != "w"
> ## then use want to index into dat2
> dat2[, want] <- with(dat2, dat2[, want] * w)
> dat2
   c1  c2   w   c3  c4
1 abc 2.0 1.0  5.0 3.0
2 dxf 1.5 0.5  3.0 2.0
3 ghi 6.0 1.5 10.5 7.5
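
An equivalent formulation, offered only as a sketch, multiplies the selected columns one at a time with `lapply`. This makes it explicit that each column is scaled element by element against `w`, rather than relying on how a data frame is multiplied by a vector.

```r
dat2 <- data.frame(c1 = c("abc", "dxf", "ghi"), c2 = 2:4, w = c(1, 0.5, 1.5),
                   c3 = 5:7, c4 = 3:5)
# Logical index: numeric columns, excluding the weight column
want <- sapply(dat2, is.numeric) & names(dat2) != "w"
# Multiply each selected column by w, one column at a time
dat2[want] <- lapply(dat2[want], function(col) col * dat2$w)
dat2
```

The result should match the output shown above: `c2`, `c3` and `c4` are weighted, while `c1` and `w` are unchanged.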

め七分饶幸 2024-10-25 20:27:23

Just for the fun of trying to do it in one line (though it is really not the most readable!):

R> dd <- data.frame(G=gl(2,2), X=rnorm(4), Y=1L:4L, Z=letters[1:4], W=0.3:3.3)
R> dd
  G         X Y Z   W
1 1 0.2319565 1 a 0.3
2 1 0.4242205 2 b 1.3
3 2 0.5218064 3 c 2.3
4 2 0.7155944 4 d 3.3

R> data.frame(lapply(subset(dd, select=-W), function(v, w=dd$W) { if (is.numeric(v)) v*w else v }), W=dd$W)
  G          X    Y Z   W
1 1 0.06958695  0.3 a 0.3
2 1 0.55148670  2.6 b 1.3
3 2 1.20015475  6.9 c 2.3
4 2 2.36146163 13.2 d 3.3
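
If readability matters more than brevity, the one-liner could be unpacked roughly as below. This is a sketch rather than the answer's own code: the names `is_target` and `weighted` are invented here, and the seed is arbitrary. Unlike the one-liner, it keeps the columns in their original order even when the weight column is not the last one.

```r
set.seed(1)  # arbitrary seed, only so the random column is reproducible
dd <- data.frame(G = gl(2, 2), X = rnorm(4), Y = 1L:4L, Z = letters[1:4], W = 0.3:3.3)

# TRUE for columns that should be weighted: numeric, and not W itself
is_target <- sapply(dd, is.numeric) & names(dd) != "W"

# Walk over the columns once; scale the targets, pass the rest through
weighted <- dd
weighted[] <- Map(function(v, scale) if (scale) v * dd$W else v, dd, is_target)
weighted
```

The factor `G` and the character column `Z` pass through untouched because `is.numeric` is `FALSE` for both.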

浴红衣 2024-10-25 20:27:23

As you have seen, there are a number of ways to do this, but somehow you'd expect one really simple way, and I don't know whether that exists. There is a function in the plyr package called colwise that comes close, but I can't come up with a clean way to get it to do exactly what you want. The best I can do with colwise is this (assuming your data frame is named df):

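library(plyr)  # provides colwise()
w2 <- df$w     # save the weights
df <- colwise(function(x, w) { if (is.numeric(x)) { x * w } else { x } })(df, df$w)
df$w <- w2     # restore the weights, which were multiplied as well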

For those who are familiar with colwise, I don't think you can simply use numcolwise, because then the non-numeric columns are not emitted at all. And I can't figure out any clean way to keep the multiplication from being applied to the weight column, which is why I simply save and restore it here. I think that if a cleaner way of doing this can be worked out, colwise is a nice, simple and easy-to-understand way to do this.
