如何将列的值除以另一列的每个值的值

发布于 2025-02-08 22:43:16 字数 1742 浏览 2 评论 0原文

我试图在R中弄清楚如何做到这一点,但真的很感谢对此的一些投入。假设我有两个dataFrames A和b:

dataframe a

a <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
b <- c(1, 5, 10, 2, 3, 8, 10, 28, 36)
c <- c(runif(9, min=5, max=99))
df_A <- data.frame(a,b,c)
names(df_A) <- c('name', 'trial', 'counts')
   
   name trial   counts
1    A     1 42.18785
2    A     5 17.17859
3    A    10 29.34961
4    B     2 23.20101
5    B     3 58.57507
6    B     8 28.94360
7    C    10 25.48171
8    C    28 55.67896
9    C    36 10.04799

dataframe b

e <- c("A", "A", "A", "B", "C", "C")
f <- c(1, 5, 10, 2, 3, 28)
g <- c(runif(6, min=5, max=99))
df_B <- data.frame(e,f,g)
names(df_B) <- c('name', 'trial', 'rate')
   
    name trial   rate
1    A     1  8.408579
2    A     5 28.029798
3    A    10 18.904179
4    B     2 20.577880
5    C     3 44.492629
6    C    28 81.408402

如您所见,这两个dataframes共享两个列,但长度有所不同。我需要做的是将计数列中的每个值除以RATE> RATE列的每个值 dataframe b 。这必须按名称为基础完成(即,group_by name列)。此后,正确的数据框将是这样的:

   name trial   counts
1    A     1 42.18785 / 8.408579
2    A     1 42.18785 / 28.029798
3    A     1 42.18785 / 18.904179
4    A     5 17.17859 / 8.408579
5    A     5 17.17859 / 28.029798
6    A     5 17.17859 / 18.904179
7    A    10 29.34961 / 8.408579
8    A    10 29.34961 / 28.029798
9    A    10 29.34961 / 18.904179
10    B     2 23.20101 / 20.577880
11    B     3 58.57507 / 20.577880
12    B     8 28.94360 / 20.577880
13    C    10 25.48171 / 44.492629
14    C    10 25.48171 / 81.408402
15    C    28 55.67896 / 44.492629
16    C    36 10.04799 / 81.408402

I am trying to figure out how to do this in R but would really appreciate some input on this. Let's say I have two dataframes, A and B:

dataframe A

a <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
b <- c(1, 5, 10, 2, 3, 8, 10, 28, 36)
c <- c(runif(9, min=5, max=99))
df_A <- data.frame(a,b,c)
names(df_A) <- c('name', 'trial', 'counts')
   
   name trial   counts
1    A     1 42.18785
2    A     5 17.17859
3    A    10 29.34961
4    B     2 23.20101
5    B     3 58.57507
6    B     8 28.94360
7    C    10 25.48171
8    C    28 55.67896
9    C    36 10.04799

dataframe B

e <- c("A", "A", "A", "B", "C", "C")
f <- c(1, 5, 10, 2, 3, 28)
g <- c(runif(6, min=5, max=99))
df_B <- data.frame(e,f,g)
names(df_B) <- c('name', 'trial', 'rate')
   
    name trial   rate
1    A     1  8.408579
2    A     5 28.029798
3    A    10 18.904179
4    B     2 20.577880
5    C     3 44.492629
6    C    28 81.408402

As you can see, these two dataframes share two columns but differ in length. What I need to do is to divide each value in the counts column by each value of the rate column in dataframe B. This has to be done on a name-by-name basis (i.e., group_by name column). A correct dataframe after this will look like this:

   name trial   counts
1    A     1 42.18785 / 8.408579
2    A     1 42.18785 / 28.029798
3    A     1 42.18785 / 18.904179
4    A     5 17.17859 / 8.408579
5    A     5 17.17859 / 28.029798
6    A     5 17.17859 / 18.904179
7    A    10 29.34961 / 8.408579
8    A    10 29.34961 / 28.029798
9    A    10 29.34961 / 18.904179
10    B     2 23.20101 / 20.577880
11    B     3 58.57507 / 20.577880
12    B     8 28.94360 / 20.577880
13    C    10 25.48171 / 44.492629
14    C    10 25.48171 / 81.408402
15    C    28 55.67896 / 44.492629
16    C    36 10.04799 / 81.408402

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

冷…雨湿花 2025-02-15 22:43:16

这是基本R解决方案。 合并数据集并将结果列划分为计数 速率。用r 4.2.0中引入的管道完成,以避免创建工作,临时data.frame。

merge(df_A, df_B[-2]) |>
  (\(x) cbind(x[1:2], counts = x[[3]]/x[[4]]))()
#>    name trial    counts
#> 1     A     1 4.9008255
#> 2     A     1 1.9812148
#> 3     A     1 0.8574978
#> 4     A     5 3.2969133
#> 5     A     5 1.3328149
#> 6     A     5 0.5768612
#> 7     A    10 0.6277524
#> 8     A    10 0.2537761
#> 9     A    10 0.1098379
#> 10    B     2 0.3528129
#> 11    B     3 4.0136321
#> 12    B     8 1.9712023
#> 13    C    10 9.7051006
#> 14    C    10 0.9257950
#> 15    C    28 2.9923193
#> 16    C    28 0.2854452
#> 17    C    36 2.2441296
#> 18    C    36 0.2140734

Here is a base R solution. merge the data sets and divide the result's columns counts by rate. Done with a pipe, introduced in R 4.2.0, to avoid the creation of a work, temporary data.frame.

merge(df_A, df_B[-2]) |>
  (\(x) cbind(x[1:2], counts = x[[3]]/x[[4]]))()
#>    name trial    counts
#> 1     A     1 4.9008255
#> 2     A     1 1.9812148
#> 3     A     1 0.8574978
#> 4     A     5 3.2969133
#> 5     A     5 1.3328149
#> 6     A     5 0.5768612
#> 7     A    10 0.6277524
#> 8     A    10 0.2537761
#> 9     A    10 0.1098379
#> 10    B     2 0.3528129
#> 11    B     3 4.0136321
#> 12    B     8 1.9712023
#> 13    C    10 9.7051006
#> 14    C    10 0.9257950
#> 15    C    28 2.9923193
#> 16    C    28 0.2854452
#> 17    C    36 2.2441296
#> 18    C    36 0.2140734

Created on 2022-06-21 by the reprex package (v2.0.1)

我不在是我 2025-02-15 22:43:16

DPLYR方法:


library(dplyr)

df_A |>    
left_join(df_B, by = "name") |>  
mutate(calc = counts / rate)

A dplyr approach:


library(dplyr)

df_A |>    
left_join(df_B, by = "name") |>  
mutate(calc = counts / rate)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文