Numbering rows within groups in a data frame when grouping by multiple variables

Posted on 2025-02-11 16:23:48

Yes, there are several questions similar to this one, but none of them involves multiple grouping variables, so their solutions do not work here. The closest ones I could find include "Numbering rows within groups in a data frame".

Dummy data for my case:

library(dplyr)
df <- tibble(
    s1 = c("111", "111", "111", "112", "112", "114", "114", "115"),
    s2 = c(rep("A", 5), rep("B", 3)),
    val = rnorm(8)
)

I want an ID that numbers the distinct s1 values within each s2 group. That is, the numbering should restart at 1 each time s2 changes. Desired output:

# A tibble: 8 x 4
  s1    s2        val    id
  <chr> <chr>   <dbl> <dbl>
1 111   A     -0.465      1
2 111   A      0.871      1
3 111   A      0.823      1
4 112   A      0.561      2
5 112   A      0.197      2
6 114   B     -0.743      1
7 114   B      0.0847     1
8 115   B     -1.05       2

One of the suggested solutions for similar questions is

library(dplyr)
df %>% group_by(s1) %>% mutate(id = row_number())

but that numbers the rows within each s1 group, restarting each time s1 changes. None of these worked either:

df %>% group_by(s1, s2) %>% mutate(id = row_number())
df %>% group_by(s2) %>% mutate(id = row_number())
df %>% group_by(s1) %>% mutate(id = row_number(s2))
df %>% group_by(s1) %>% mutate(id = cur_group_id())
df %>% group_by(s1, s2) %>% mutate(id = cur_group_id())
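
For reference, here is a minimal sketch of one way to get the desired numbering. It assumes the ids should follow the sorted order of s1 within each s2 group; the names result and result_by_appearance are only illustrative. The idea is to group by s2 alone and rank the s1 values with dplyr's dense_rank(), so equal values share a rank and no numbers are skipped. The second variant, using base R's match() and unique(), numbers s1 by order of first appearance instead, which gives the same ids for this data.

```r
library(dplyr)

df <- tibble(
    s1 = c("111", "111", "111", "112", "112", "114", "114", "115"),
    s2 = c(rep("A", 5), rep("B", 3)),
    val = rnorm(8)
)

# Group by the outer variable only, then rank the s1 values inside
# each group; ties share a rank and the ranks have no gaps.
result <- df %>%
    group_by(s2) %>%
    mutate(id = dense_rank(s1)) %>%
    ungroup()

# Variant: number s1 by first appearance rather than by sort order.
result_by_appearance <- df %>%
    group_by(s2) %>%
    mutate(id = match(s1, unique(s1))) %>%
    ungroup()
```

Both variants return id as an integer column rather than the double shown in the desired output; wrap the expression in as.numeric() if the type matters.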
