Numbering rows within groups in a data frame when grouping by multiple variables

Posted on 2025-02-11 16:23:48

Yes, there are several questions similar to this one, but none of them involve multiple grouping variables, so the solutions given there do not work for my case. The closest questions I could find are:

  1. Numbering rows within groups in a data frame, and
  2. Create a sequential number (counter) for rows within each group of a dataframe [duplicate]

Dummy data for my case:

library(dplyr)
df <- tibble(
    s1 = c("111", "111", "111", "112", "112", "114", "114", "115"),
    s2 = c(rep("A", 5), rep("B", 3)),
    val = rnorm(8)
)

I want an ID that numbers the distinct values of s1 within each s2 group. That is, rows sharing the same s1 get the same ID, and the numbering restarts at 1 each time s2 changes. Desired output:

# A tibble: 8 x 4
  s1    s2        val    id
  <chr> <chr>   <dbl> <dbl>
1 111   A     -0.465      1
2 111   A      0.871      1
3 111   A      0.823      1
4 112   A      0.561      2
5 112   A      0.197      2
6 114   B     -0.743      1
7 114   B      0.0847     1
8 115   B     -1.05       2

One of the suggested solutions for similar questions is

library(dplyr)
df %>% group_by(s1) %>% mutate(id = row_number())

but that resets each time s1 changes. Similarly, these did not work either:

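df %>% group_by(s1, s2) %>% mutate(id = row_number())      # counts rows within each s1/s2 pair
df %>% group_by(s2) %>% mutate(id = row_number())          # counts rows within each s2
df %>% group_by(s1) %>% mutate(id = row_number(s2))        # ranks s2 within each s1
df %>% group_by(s1) %>% mutate(id = cur_group_id())        # one ID per s1, never restarts
df %>% group_by(s1, s2) %>% mutate(id = cur_group_id())    # one ID per s1/s2 pair, never restarts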
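
For what it's worth, here is a minimal sketch of one approach that seems to match the desired output: group by s2 only, then number the distinct values of s1 inside each group. The column names id_sorted and id_seen and the set.seed() call are my own additions for illustration, not part of the original question. dense_rank() numbers s1 by sorted order, while match(s1, unique(s1)) numbers it by order of first appearance; the two agree on this data but could differ if s1 were not already sorted within each s2.

```r
library(dplyr)

set.seed(1)  # assumption: added only so that val is reproducible
df <- tibble(
    s1 = c("111", "111", "111", "112", "112", "114", "114", "115"),
    s2 = c(rep("A", 5), rep("B", 3)),
    val = rnorm(8)
)

res <- df %>%
    group_by(s2) %>%                          # numbering restarts for each s2
    mutate(
        id_sorted = dense_rank(s1),           # rank of s1 among the sorted distinct values
        id_seen   = match(s1, unique(s1))     # position of s1 among values in order of first appearance
    ) %>%
    ungroup()

res
```

Both columns are integers rather than the <dbl> shown in the desired output; wrapping either expression in as.numeric() should give a double column if that matters.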
