r dplyr使用group_by总结均值和stdev

发布于 2025-01-30 09:59:45 字数 1588 浏览 3 评论 0原文

我有一个看起来像这样的数据框:

df <- data.frame("Experiment" = c(rep("Exp1", 6), rep("Exp2", 5), rep("Exp3", 4)),
                 "Replicate" = c("A","A","A","B","C","C","A","A","B","B","C","A","B","B","C"),
                 "Type" = c("alpha","beta","gamma","alpha","alpha","beta","alpha","gamma","beta","gamma","beta","alpha","alpha","gamma","beta"),
                 "Frequency" = c(10,100,1000,15,5,105,10,1010,95,1020,105,15,10,990,100))

我正在尝试计算频率的均值用于实验类型的组合,而我首先通过运行此行尝试:

df %>% group_by(Experiment, Type) %>% summarise(mean = mean(Frequency), sd = sd(Frequency)

如果我运行此行,我会得到一个看起来像下面的tibble:

Experiment   Type   mean   sd
Exp1         alpha  10     5
Exp1         beta   102.   3.54
Epx1         gamma  1000   NA

但是我希望R认为所有type> (alphabetagamma)应为实验>实验replicate)存在>频率 类型的值,r将使用0而不是不包括该值。

换句话说,我想要的需要按照以下方式计算:

Experiment   Type   mean              sd
Exp1         alpha  mean(10,15,5)     sd(10,15,5)
Exp1         beta   mean(100,0,105)   sd(100,0,105)
Exp1         gamma  mean(1000,0,0)    sd(1000,0,0)

例如,对于exp1 beta总结函数我在上面使用了上述计算<代码>平均值(100,105)和SD(100,105)因为exp1 复制B在我的df中不存在。但是我希望r计算平均值(100,0,105)sd(100,0,105)而不是。有人能给我一些关于如何做到这一点的想法吗?

I have a dataframe that looks like this:

df <- data.frame("Experiment" = c(rep("Exp1", 6), rep("Exp2", 5), rep("Exp3", 4)),
                 "Replicate" = c("A","A","A","B","C","C","A","A","B","B","C","A","B","B","C"),
                 "Type" = c("alpha","beta","gamma","alpha","alpha","beta","alpha","gamma","beta","gamma","beta","alpha","alpha","gamma","beta"),
                 "Frequency" = c(10,100,1000,15,5,105,10,1010,95,1020,105,15,10,990,100))

I'm trying to calculate mean and stdev of Frequency for combination of Experiment and Type, and I first tried it by running this line:

df %>% group_by(Experiment, Type) %>% summarise(mean = mean(Frequency), sd = sd(Frequency)

If I run this, I get a tibble that looks like below:

Experiment   Type   mean   sd
Exp1         alpha  10     5
Exp1         beta   102.   3.54
Epx1         gamma  1000   NA

But I'd like R to think that all Type (alpha, beta, gamma) should exist for every combination of Experiment and Replicate, so that if there is no Frequency value for Type, R will use 0 instead of not including that value.

In other words, what I want needs to be calculated like below:

Experiment   Type   mean              sd
Exp1         alpha  mean(10,15,5)     sd(10,15,5)
Exp1         beta   mean(100,0,105)   sd(100,0,105)
Exp1         gamma  mean(1000,0,0)    sd(1000,0,0)

For example, for Exp1 beta, the summarise function I used above calculates mean(100,105) and sd(100,105) because Exp1 Replicate B doesn't exist in my df. But I want R to calculate mean(100,0,105) and sd(100,0,105) instead. Would anyone be able to give me some ideas on how to do this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

亣腦蒛氧 2025-02-06 09:59:45

您需要首先完成您的数据框架以用0填充缺失的数据,然后将“已完成”数据帧输送到您的功能。

library(tidyverse)

df %>% 
  complete(Experiment, Type, Replicate, fill = list(Frequency = 0)) %>% 
  group_by(Experiment, Type) %>% 
  summarise(mean = mean(Frequency), sd = sd(Frequency), .groups = "drop")

# A tibble: 9 × 4
  Experiment Type    mean     sd
  <chr>      <chr>  <dbl>  <dbl>
1 Exp1       alpha  10      5   
2 Exp1       beta   68.3   59.2 
3 Exp1       gamma 333.   577.  
4 Exp2       alpha   3.33   5.77
5 Exp2       beta   66.7   58.0 
6 Exp2       gamma 677.   586.  
7 Exp3       alpha   8.33   7.64
8 Exp3       beta   33.3   57.7 
9 Exp3       gamma 330    572.  

You need to first complete your dataframe to fill in missing data with 0, then pipe the "completed" dataframe to your functions.

library(tidyverse)

df %>% 
  complete(Experiment, Type, Replicate, fill = list(Frequency = 0)) %>% 
  group_by(Experiment, Type) %>% 
  summarise(mean = mean(Frequency), sd = sd(Frequency), .groups = "drop")

# A tibble: 9 × 4
  Experiment Type    mean     sd
  <chr>      <chr>  <dbl>  <dbl>
1 Exp1       alpha  10      5   
2 Exp1       beta   68.3   59.2 
3 Exp1       gamma 333.   577.  
4 Exp2       alpha   3.33   5.77
5 Exp2       beta   66.7   58.0 
6 Exp2       gamma 677.   586.  
7 Exp3       alpha   8.33   7.64
8 Exp3       beta   33.3   57.7 
9 Exp3       gamma 330    572.  
始终不够 2025-02-06 09:59:45

您需要在group_by功能中包含replicate,然后将输出转换为更宽的tibble。数字列可以通过替换Na值进行突变。然后,串联平均值和SD列将产生所需的输出。

df %>% group_by(Experiment, Type, Replicate) %>% 
  summarise(mean = mean(Frequency), sd = sd(Frequency)) %>% 
  pivot_wider(names_from = Replicate, values_from =  c(mean, sd)) %>% 
mutate(across(where(is.double),~ replace_na(.,0))) %>% 
  mutate(mean = paste0("mean(", mean_A, ",", mean_B, ",", mean_C, ")"),
         sd = paste0("sd(", sd_A, ",", sd_B, ",", sd_C, ")")) %>% 
  select(Experiment, Type, mean, sd)

输出是

# A tibble: 9 x 4
# Groups:   Experiment, Type [9]
  Experiment Type  mean              sd       
  <chr>      <chr> <chr>             <chr>    
1 Exp1       alpha mean(10,15,5)     sd(0,0,0)
2 Exp1       beta  mean(100,0,105)   sd(0,0,0)
3 Exp1       gamma mean(1000,0,0)    sd(0,0,0)
4 Exp2       alpha mean(10,0,0)      sd(0,0,0)
5 Exp2       beta  mean(0,95,105)    sd(0,0,0)
6 Exp2       gamma mean(1010,1020,0) sd(0,0,0)
7 Exp3       alpha mean(15,10,0)     sd(0,0,0)
8 Exp3       beta  mean(0,0,100)     sd(0,0,0)
9 Exp3       gamma mean(0,990,0)     sd(0,0,0)

You need to include Replicate in the group_by function and conver the output into a wider tibble. The number columns can be mutated by replacing NA values. Then, concatenating the mean and sd columns would give the desired output.

df %>% group_by(Experiment, Type, Replicate) %>% 
  summarise(mean = mean(Frequency), sd = sd(Frequency)) %>% 
  pivot_wider(names_from = Replicate, values_from =  c(mean, sd)) %>% 
mutate(across(where(is.double),~ replace_na(.,0))) %>% 
  mutate(mean = paste0("mean(", mean_A, ",", mean_B, ",", mean_C, ")"),
         sd = paste0("sd(", sd_A, ",", sd_B, ",", sd_C, ")")) %>% 
  select(Experiment, Type, mean, sd)

The output is

# A tibble: 9 x 4
# Groups:   Experiment, Type [9]
  Experiment Type  mean              sd       
  <chr>      <chr> <chr>             <chr>    
1 Exp1       alpha mean(10,15,5)     sd(0,0,0)
2 Exp1       beta  mean(100,0,105)   sd(0,0,0)
3 Exp1       gamma mean(1000,0,0)    sd(0,0,0)
4 Exp2       alpha mean(10,0,0)      sd(0,0,0)
5 Exp2       beta  mean(0,95,105)    sd(0,0,0)
6 Exp2       gamma mean(1010,1020,0) sd(0,0,0)
7 Exp3       alpha mean(15,10,0)     sd(0,0,0)
8 Exp3       beta  mean(0,0,100)     sd(0,0,0)
9 Exp3       gamma mean(0,990,0)     sd(0,0,0)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文