Aggregating and summarising character objects with R

Posted on 2025-01-18 15:11:07

I have a breeding productivity dataset:

df1
    # Nest.box Obs.type individual.number Clutch Chick.status
    # 1 Nest1 Egg 1 First NA
    # 2 Nest1 Egg 2 First NA
    # 3 Nest1 Egg 3 First NA
    # 4 Nest2 Egg 1 First NA
    # 5 Nest2 Egg 2 First NA
    # 6 Nest2 Egg 1 First NA
    # 7 Nest1 Chick 1 First Dead
    # 8 Nest1 Chick 2 First Fledged
    # 9 Nest2 Chick 1 First Fledged
    # 10 Nest2 Chick 2 First Fledged
    # 11 Nest2 Chick 1 Second Fledged
    # 12 Nest2 Chick 2 Second UNK

I want to summarise these data by Nest.box and Clutch, i.e. count the number of chicks with Chick.status "Fledged" for each nest box and clutch.

The desired output would look something like this:

output
        # Nest.box Clutch Fledged
        # 1 Nest1 First 1
        # 2 Nest2 First 2
        # 3 Nest2 Second 1

Comments (4)

夜空下最亮的亮点 2025-01-25 15:11:07

Another option is to filter then use count:

library(tidyverse)

df %>% 
  filter(Chick.status == "Fledged") %>% 
  count(Nest.box, Clutch)

Output

  Nest.box Clutch n
1    Nest1  First 1
2    Nest2  First 2
3    Nest2 Second 1
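
For readers without the tidyverse installed, the same filter-then-count idea can be sketched in base R. This is only an illustration, not part of the original answer: the data frame is rebuilt here from the question's table (three columns only), and the names `fledged` and `counts` are assumptions made for the example.

```r
# Rebuild the relevant columns of the question's data (12 rows)
df <- data.frame(
  Nest.box = c("Nest1", "Nest1", "Nest1", "Nest2", "Nest2", "Nest2",
               "Nest1", "Nest1", "Nest2", "Nest2", "Nest2", "Nest2"),
  Clutch = c(rep("First", 10), rep("Second", 2)),
  Chick.status = c(rep(NA, 6), "Dead", "Fledged", "Fledged", "Fledged",
                   "Fledged", "UNK")
)

# Step 1: keep fledged chicks only; subset() treats NA comparisons as FALSE
fledged <- subset(df, Chick.status == "Fledged")

# Step 2: count the remaining rows per Nest.box / Clutch combination
counts <- aggregate(Chick.status ~ Nest.box + Clutch, data = fledged, FUN = length)
names(counts)[3] <- "n"  # rename the count column to match count()'s default

counts
```

With the question's data this should give the same three rows as the dplyr version (counts of 1, 2 and 1).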

Data

df <- structure(list(Nest.box = c("Nest1", "Nest1", "Nest1", "Nest2", 
"Nest2", "Nest2", "Nest1", "Nest1", "Nest2", "Nest2", "Nest2", 
"Nest2"), Obs.type = c("Egg", "Egg", "Egg", "Egg", "Egg", "Egg", 
"Chick", "Chick", "Chick", "Chick", "Chick", "Chick"), individual.number = c(1L, 
2L, 3L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L), Clutch = c("First", 
"First", "First", "First", "First", "First", "First", "First", 
"First", "First", "Second", "Second"), Chick.status = c(NA, NA, 
NA, NA, NA, NA, "Dead", "Fledged", "Fledged", "Fledged", "Fledged", 
"UNK")), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5", "6", "7", "8", "9", "10", "11", "12"))

橘寄 2025-01-25 15:11:07

Here is a potential solution:

library(dplyr)

df <- read.table(text = "Nest.box Obs.type individual.number Clutch Chick.status
1 Nest1 Egg 1 First NA
2 Nest1 Egg 2 First NA
3 Nest1 Egg 3 First NA
4 Nest2 Egg 1 First NA
5 Nest2 Egg 2 First NA
6 Nest2 Egg 1 First NA
7 Nest1 Chick 1 First Dead
8 Nest1 Chick 2 First Fledged
9 Nest2 Chick 1 First Fledged
10 Nest2 Chick 2 First Fledged
11 Nest2 Chick 1 Second Fledged
12 Nest2 Chick 2 Second UNK", header = TRUE)

df %>%
  group_by(Nest.box, Clutch) %>%
  summarise(Fledged = sum(Chick.status == "Fledged", na.rm = TRUE))

#> # A tibble: 3 × 3
#> # Groups:   Nest.box [2]
#>   Nest.box Clutch Fledged
#>   <chr>    <chr>    <int>
#> 1 Nest1    First        1
#> 2 Nest2    First        2
#> 3 Nest2    Second       1

Created on 2022-04-04 by the reprex package (v2.0.1)
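
The `na.rm = TRUE` in the summarise() call matters because the egg rows have a missing Chick.status. A small base R sketch of what happens inside one group (the vector `status` is just the Nest1, first-clutch values copied from the question; the variable names are assumptions for the example):

```r
# Chick.status values for Nest1, first clutch (rows 1-3, 7 and 8 of the data)
status <- c(NA, NA, NA, "Dead", "Fledged")

# Comparing NA with a string gives NA, not FALSE
is_fledged <- status == "Fledged"
is_fledged

sum(is_fledged)                 # NA: one missing value makes the whole sum missing
sum(is_fledged, na.rm = TRUE)   # 1: the NAs are dropped before summing
```

Summing a logical vector counts its TRUE values, which is why this approach returns the number of fledged chicks per group.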

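我做我的改变 2025-01-25 15:11:07

library(dplyr)

df2 <- df1 %>%
  distinct() %>%                           # drop exact duplicate rows
  filter(Chick.status == "Fledged") %>%    # keep fledged chicks only; without this step tally() counts every row
  group_by(Nest.box, Clutch) %>%           # column is Nest.box (originally misspelled Next.box)
  tally() %>%                              # number of rows per group, returned in column n
  ungroup()
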
白昼 2025-01-25 15:11:07

Using data.table:

library(data.table)
setDT(df)[Chick.status=="Fledged", .N, by=.(Nest.box, Clutch)]

##    Nest.box Clutch N
## 1:    Nest1  First 1
## 2:    Nest2  First 2
## 3:    Nest2 Second 1

This converts df to a data.table (setDT(df)), filters on Chick.status=='Fledged', and counts (.N) the rows grouped by Nest.box and Clutch.
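
One consequence of filtering before grouping is worth keeping in mind: a clutch with no fledged chicks produces no row at all, rather than a row with a count of 0. The base R sketch below illustrates this under a stated assumption: Nest3 is a hypothetical nest added for the example and does not appear in the question's data. The names `chicks`, `filter_then_count` and `count_in_group` are likewise made up here.

```r
# Chick rows from the question, plus one HYPOTHETICAL nest (Nest3) whose only chick died
chicks <- data.frame(
  Nest.box     = c("Nest1", "Nest1", "Nest2", "Nest2", "Nest2", "Nest2", "Nest3"),
  Clutch       = c("First", "First", "First", "First", "Second", "Second", "First"),
  Chick.status = c("Dead", "Fledged", "Fledged", "Fledged", "Fledged", "UNK", "Dead")
)

# Filter first, then count: Nest3 has no matching rows, so it disappears
filtered <- subset(chicks, Chick.status == "Fledged")
filter_then_count <- aggregate(Chick.status ~ Nest.box + Clutch,
                               data = filtered, FUN = length)

# Sum a logical within every group: Nest3 is kept with a count of 0
chicks$Fledged <- chicks$Chick.status == "Fledged"
count_in_group <- aggregate(Fledged ~ Nest.box + Clutch, data = chicks, FUN = sum)

filter_then_count
count_in_group
```

If zero counts matter for the productivity summary, the group-then-sum pattern is the safer choice; if only successful clutches are of interest, filtering first is shorter.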
