Grouped summarize still returns a result for every individual row

Posted 2025-01-13 01:21:32

I have the following data:

library(tidyverse)

df <- data.frame(id = c(1,1,1,2,2,2),
                 x = rep(letters[1:2], each = 3),
                 y = c(3,4,3,5,6,5),
                 z = c(7,8,9,10,11,12))

I now want to summarize the data by id in a way where I get the sum of z depending on y values. The y condition itself depends on the value of x.

I thought I could use the code below, but it returns one row per input row instead of summarizing. The values are correct, but I still want one row per id.

df %>%
  group_by(id) %>%
  summarize(test = case_when(x == 'a' ~ sum(z[y == 3]),
                             x == 'b' ~ sum(z[y == 5])))

# A tibble: 6 x 2
# Groups:   id [2]
     id  test
  <dbl> <dbl>
1     1    16
2     1    16
3     1    16
4     2    22
5     2    22
6     2    22

The following works, but I don't understand why it does when the code above does not.

df %>%
  group_by(id) %>%
  summarize(test = case_when(all(x == 'a') ~ sum(z[y == 3]),
                             all(x == 'b') ~ sum(z[y == 5])))

# A tibble: 2 x 2
     id  test
  <dbl> <dbl>
1     1    16
2     2    22

Also, is there a more straightforward way to do this summarization?


Comments (1)

梦里泪两行 2025-01-20 01:21:32

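Because case_when, like ifelse(test, yes, no), returns a vector as long as its test conditions. x == 'a' has one element per row of the group, so the first version returns one value per row. all(x == 'a') has length 1, so the returned value has length 1 and summarize produces one row per id.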
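To make the length argument concrete, here is a small sketch that evaluates both case_when calls by hand on the id == 1 group and then tries one possible shorter summarization. The names g1, per_row, per_group, target and res are made up for illustration, and the lookup-vector approach is only one option, not necessarily what the asker had in mind. As far as I know, dplyr 1.1.0 and later also warn when summarise() returns more than one row per group and point to reframe() instead.

```r
library(dplyr)

df <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                 x = rep(letters[1:2], each = 3),
                 y = c(3, 4, 3, 5, 6, 5),
                 z = c(7, 8, 9, 10, 11, 12))

# Take the rows of one group to inspect the lengths involved.
g1 <- df[df$id == 1, ]

# Vector condition: g1$x == "a" has three elements, so three results.
per_row <- case_when(g1$x == "a" ~ sum(g1$z[g1$y == 3]),
                     g1$x == "b" ~ sum(g1$z[g1$y == 5]))
length(per_row)

# Scalar condition: all() collapses to one TRUE/FALSE, so one result.
per_group <- case_when(all(g1$x == "a") ~ sum(g1$z[g1$y == 3]),
                       all(g1$x == "b") ~ sum(g1$z[g1$y == 5]))
length(per_group)

# Hypothetical lookup vector: the y value that counts for each x.
target <- c(a = 3, b = 5)

# Compare each row's y with the target for its x, then sum the matching z.
res <- df %>%
  group_by(id) %>%
  summarise(test = sum(z[y == unname(target[as.character(x)])]))
res
```

Unlike the all() version, the lookup compares row by row, so it would also handle a group that mixes 'a' and 'b' rows. Whether that is the desired behaviour depends on the real data.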
