How to collect members of a column based on the value of a specific member of that column in R

Posted 2025-01-24 09:56:45

In the following data frame, I want to collect the members of B1 whose value in B2 is greater than or equal to the B2 value of "b" within the same ID. Then, from that filtered result, I want to count how many times each B1 member occurs per ID.

dataframe:

ID  B1  B2
z1  a   2.5
z1  b   1.7
z1  c   170
z1  c   9
z1  d   3
y2  a   0
y2  b   21
y2  c   15
y2  c   101
y2  d   30
y2  d   3
y2  d   15.5
x3  a   30.8
x3  a   54
x3  a   0
x3  b   30.8
x3  c   30.8
x3  d   7

so the result would be:

ID  B1  B2
z1  a   2.5
z1  c   170
z1  c   9
z1  d   3
y2  c   101
y2  d   30
x3  a   30.8
x3  a   54
x3  c   30.8

and

ID  B1  count
z1  a   1
z1  c   2
z1  d   1
y2  a   0
y2  c   1
y2  d   1
x3  a   2
x3  c   1
x3  d   0


Comments (2)

喵星人汪星人 2025-01-31 09:56:46

Group by 'ID', then filter the rows where 'B2' is greater than or equal to the 'B2' value of the row where 'B1' is 'b'. A second condition drops the rows where 'B1' is 'b' itself.

library(dplyr)
out1 <- df1 %>%
    group_by(ID) %>% 
    filter(any(B1 == "b") & B2 >= min(B2[B1 == "b"]), B1 != 'b') 

-output

> out1
# A tibble: 9 × 3
# Groups:   ID [3]
  ID    B1       B2
  <chr> <chr> <dbl>
1 z1    a       2.5
2 z1    c     170  
3 z1    c       9  
4 z1    d       3  
5 y2    c     101  
6 y2    d      30  
7 x3    a      30.8
8 x3    a      54  
9 x3    c      30.8

For the second output, group by 'B1' as well, use summarise to get the number of rows per group, and then fill in the missing combinations with complete

library(tidyr)
out1 %>% 
  group_by(B1, .add = TRUE) %>%
  summarise(count = n(), .groups = "drop_last") %>% 
  complete(B1 = unique(.$B1), fill = list(count = 0)) %>%
  ungroup
# A tibble: 9 × 3
  ID    B1    count
  <chr> <chr> <int>
1 x3    a         2
2 x3    c         1
3 x3    d         0
4 y2    a         0
5 y2    c         1
6 y2    d         1
7 z1    a         1
8 z1    c         2
9 z1    d         1

data

df1 <- structure(list(ID = c("z1", "z1", "z1", "z1", "z1", "y2", "y2", 
"y2", "y2", "y2", "y2", "y2", "x3", "x3", "x3", "x3", "x3", "x3"
), B1 = c("a", "b", "c", "c", "d", "a", "b", "c", "c", "d", "d", 
"d", "a", "a", "a", "b", "c", "d"), B2 = c(2.5, 1.7, 170, 9, 
3, 0, 21, 15, 101, 30, 3, 15.5, 30.8, 54, 0, 30.8, 30.8, 7)), 
class = "data.frame", row.names = c(NA, 
-18L))
故事和酒 2025-01-31 09:56:46

Using tidyverse:

library(tidyverse)

# df is the data frame from the question (called df1 in the other answer)
df %>% 
  group_by(ID) %>% 
  filter(B2 > B2[B1 == "b"]) %>%
  group_by(ID, B1) %>%
  count(name = "count") %>%
  as.data.frame()
#>   ID B1 count
#> 1 x3  a     1
#> 2 y2  c     1
#> 3 y2  d     1
#> 4 z1  a     1
#> 5 z1  c     2
#> 6 z1  d     1

Created on 2022-04-26 by the reprex package (v2.0.1)
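Note that the strict `>` in this answer drops rows that tie with the "b" value, which is why x3 shows only one "a" and no "c", whereas the question asks for "equal to or more than". A minimal sketch of a variant that keeps ties might look like the following. It is an adaptation, not the answerer's code: it rebuilds the question's data as `df1`, assumes each ID has exactly one "b" row, and has to exclude the "b" rows explicitly because `>=` would otherwise keep them.

```r
library(dplyr)

# Same data as the question, rebuilt so the block is self-contained
df1 <- data.frame(
  ID = rep(c("z1", "y2", "x3"), times = c(5, 7, 6)),
  B1 = c("a", "b", "c", "c", "d",
         "a", "b", "c", "c", "d", "d", "d",
         "a", "a", "a", "b", "c", "d"),
  B2 = c(2.5, 1.7, 170, 9, 3,
         0, 21, 15, 101, 30, 3, 15.5,
         30.8, 54, 0, 30.8, 30.8, 7)
)

counts <- df1 %>%
  group_by(ID) %>%
  # >= keeps ties with the "b" value; the "b" row itself is removed
  filter(B2 >= B2[B1 == "b"], B1 != "b") %>%
  ungroup() %>%
  count(ID, B1, name = "count")
```

Like the original, this variant omits combinations with a zero count, so it returns 7 rows rather than the 9 shown in the question.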
