How to summarise and subset multi-level grouped data frames in dplyr and R
I have the following data in long format:
library(dplyr)

testdf <- tibble(
  name = c(rep("john", 4), rep("joe", 2)),
  rep = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)
For each named person (John and Joe), I want to summarise each of their pets.
For some reason I can't seem to deal with the repeated events/pets in the "john" data.
If I filter for just Joe (who has only one pet), the code works.
Any help is much appreciated.
testdf %>%
  group_by(name, rep) %>%
  # filter(name == "joe") %>% # when I filter only for Joe, the code works
  summarise(
    about = paste0(
      "The pet is a: ", .[field == "pet", "value"], " and it is ", .[field == "age", "value"]
    )
  )
Comments (2)
This can also be done with data.table, as follows:
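A minimal sketch of what a data.table version might look like, assuming the same columns as `testdf` in the question. The names `testdt` and `result_dt` are made up for illustration, and this is not necessarily the code the commenter posted.

```r
library(data.table)

# Same data as the question, built as a data.table (testdt is a hypothetical name)
testdt <- data.table(
  name  = c(rep("john", 4), rep("joe", 2)),
  rep   = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)

# For each (name, rep) group, pick the pet and age values and paste them together
result_dt <- testdt[
  ,
  .(about = paste0("The pet is a: ", value[field == "pet"],
                   " and it is ", value[field == "age"])),
  by = .(name, rep)
]

print(result_dt)
```

Inside `j`, `value` and `field` refer to the current group's rows only, so no `.SD` is needed for this case.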
Your data is in long format and not tidy: a single column holds several fields. Spreading it, or pivoting it wider, is what langtang's answer does. (data.table may be better, but I still find .SD difficult to use.)
I prefer to do these things as simply as possible in dplyr.
An alternative that avoids spreading is as follows; it yields the same results.
So, in 3 lines:
yields:
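For reference, the pivot-wider route mentioned above could be sketched roughly like this. It assumes `tidyr::pivot_wider()` is available, and `result_wide` is a made-up name; it is not langtang's exact code.

```r
library(dplyr)
library(tidyr)

testdf <- tibble(
  name  = c(rep("john", 4), rep("joe", 2)),
  rep   = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)

# Turn the "field" values into columns (pet, age), one row per name/rep pair,
# then build the sentence from those columns
result_wide <- testdf %>%
  pivot_wider(names_from = field, values_from = value) %>%
  mutate(about = paste0("The pet is a: ", pet, " and it is ", age))

print(result_wide)
```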
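A minimal sketch of such a no-reshape approach, assuming the `testdf` from the question (the name `result` is made up for illustration). It indexes the `value` column directly instead of going through `.`, because in a magrittr pipe `.` refers to the whole piped data frame rather than the current group. That is likely why the original only worked after filtering down to Joe's single pet.

```r
library(dplyr)

testdf <- tibble(
  name  = c(rep("john", 4), rep("joe", 2)),
  rep   = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)

# Within summarise(), bare column names are evaluated per group,
# so value[field == "pet"] is the pet for the current name/rep pair
result <- testdf %>%
  group_by(name, rep) %>%
  summarise(
    about = paste0("The pet is a: ", value[field == "pet"],
                   " and it is ", value[field == "age"]),
    .groups = "drop"
  )

print(result)
# about column, in group order (joe 1, john 1, john 2):
#   "The pet is a: fish and it is young"
#   "The pet is a: dog and it is young"
#   "The pet is a: cat and it is old"
```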