How to filter nested data

Posted on 2025-02-09 04:41:25 | 896 characters | 5 views | 0 comments

How can I filter a nested dataset (make sure the nest is the exact same as some reference vector or tibble)?

library(tidyverse)

rev_vec <- c("apple", "pear", "banana")

df <- tibble(
  ID = rep(1:3, each = 3),
  fruits = c("apple", "pear", "banana",
             "Pineapple", "Pineapple", "orange",
             "lime", "pear", NA))

df_vec <- df %>%
  group_by(ID) %>%
  summarise(fruits = list(unique(fruits)))

## This does not work
df_vec %>% 
  filter(fruits == rev_vec)

## This does not work
df_vec %>% 
  filter(unlist(fruits) == rev_vec)

## This does not work
df_vec %>% 
  filter(all(unlist(fruits[[1]]) ==rev_vec))

Basically, I just need to know which ID (in this case 1) matches the reference vector.

Expected outcome

Only ID 1 matches rev_vec.

df_vec %>%
   filter(....)
# A tibble: 1 x 2
     ID fruits   
  <int> <list>   
1     1 <chr [3]>

Comments (3)

赠我空喜 2025-02-16 04:41:25
df_vec %>% 
    filter(map_lgl(fruits, ~setequal(., rev_vec)))

# A tibble: 1 x 2
     ID fruits   
  <int> <list>   
1     1 <chr [3]>
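
One caveat worth illustrating: setequal() compares its arguments as sets, so element order and repeated values are ignored. If "exactly the same" is meant to include order, identical() can be swapped in. The following is only a minimal sketch under that assumption; df_vec is rebuilt by hand here to mirror the question's nested tibble, and the name exact is made up for illustration.

```r
library(dplyr)
library(purrr)

rev_vec <- c("apple", "pear", "banana")

# Hand-built copy of the question's nested tibble (one row per ID)
df_vec <- tibble(
  ID = 1:3,
  fruits = list(
    c("apple", "pear", "banana"),
    c("Pineapple", "orange"),
    c("lime", "pear", NA)
  )
)

# setequal() ignores order and repeats
setequal(c("banana", "apple", "pear", "apple"), rev_vec)  # TRUE

# identical() also requires the same order and length
identical(c("banana", "apple", "pear"), rev_vec)          # FALSE

# Order-sensitive variant of the filter (an alternative, not the answer's code)
exact <- df_vec %>%
  filter(map_lgl(fruits, ~ identical(.x, rev_vec)))
```

For the data in the question both variants keep only ID 1; they differ only when a group holds the same fruits in a different order.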
┈┾☆殇 2025-02-16 04:41:25

Not sure how you want the output structured, but here is an idea:

library(dplyr)

df %>% 
 group_by(ID) %>% 
 mutate(new = sum(fruits %in% rev_vec) == n())

# A tibble: 9 x 3
# Groups:   ID [3]
     ID fruits    new  
  <int> <chr>     <lgl>
1     1 apple     TRUE 
2     1 pear      TRUE 
3     1 banana    TRUE 
4     2 Pineapple FALSE
5     2 Pineapple FALSE
6     2 orange    FALSE
7     3 lime      FALSE
8     3 pear      FALSE
9     3 NA        FALSE

Another option, keeping only the matching groups and nesting them:

df %>% 
 group_by(ID) %>% 
 mutate(new = sum(fruits %in% rev_vec) == n()) %>% 
 filter(new) %>% 
 nest()

# A tibble: 1 x 2
# Groups:   ID [1]
     ID data            
  <int> <list>          
1     1 <tibble [3 x 2]>
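
Note that sum(fruits %in% rev_vec) == n() only checks that every fruit in a group appears in rev_vec, so a group holding just part of the reference vector would pass as well. A rough sketch of a two-way check follows; the data frame df2 and its ID 4 are hypothetical, added only to show the difference, and the column names subset_only and both_ways are made up.

```r
library(dplyr)

rev_vec <- c("apple", "pear", "banana")

# Hypothetical data: ID 4 contains only part of the reference vector
df2 <- tibble(
  ID     = c(1, 1, 1, 4, 4),
  fruits = c("apple", "pear", "banana", "apple", "pear")
)

flags <- df2 %>%
  group_by(ID) %>%
  summarise(
    # the one-way check from the answer: group is a subset of rev_vec
    subset_only = sum(fruits %in% rev_vec) == n(),
    # two-way check: additionally require every reference value in the group
    both_ways   = all(fruits %in% rev_vec) && all(rev_vec %in% fruits)
  )
```

Here subset_only is TRUE for both IDs, while both_ways is TRUE only for ID 1. Like setequal(), the two-way check still ignores order and repeated values.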
机场等船 2025-02-16 04:41:25

Perhaps you could try identical() to check whether the fruits for each ID exactly match the reference vector.

library(tidyverse)

df %>%
  group_by(ID) %>%
  filter(identical(fruits, rev_vec))

Output

     ID fruits
  <int> <chr> 
1     1 apple 
2     1 pear  
3     1 banana
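
If the one-row-per-ID shape from the question is wanted, the matching groups can be collapsed back into a list column afterwards. This is only a sketch of one way to do it; the name nested is made up. Keep in mind that identical() is compared against the raw rows of each group here, so a group with repeated rows or a different row order would not match.

```r
library(dplyr)

rev_vec <- c("apple", "pear", "banana")

df <- tibble(
  ID = rep(1:3, each = 3),
  fruits = c("apple", "pear", "banana",
             "Pineapple", "Pineapple", "orange",
             "lime", "pear", NA)
)

nested <- df %>%
  group_by(ID) %>%
  filter(identical(fruits, rev_vec)) %>%  # keeps whole groups that match exactly
  summarise(fruits = list(fruits))        # collapse back to one row per ID
```

With the question's data, nested has a single row for ID 1 whose fruits element is the three-element character vector.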