R: How to keep the last 2-3 rows within the same ID when values in a column are the same in longitudinal data?

Posted 2025-01-12 02:52:28

Using R, I would like to select the last rows within each ID in longitudinal data. When several trailing rows of an ID share the same value in the time column, I want to keep all of them (e.g., the 2 rows with time 5 for ID 1 and the 3 rows with time 4 for ID 3). When the last time value occurs only once within an ID, I want to keep the last row only (e.g., the row with time 7 for ID 2).

My data frame is as follows:

id time    dx    code
1   1   primary   A1
1   5   primary   D2
1   5   secondary B3
2   1   primary   A2
2   7   primary   C4
3   4   primary   A1
3   4   secondary B3
3   4   tertiary  D2

I want the following result:

id time    dx    code
1   5   primary   D2
1   5   secondary B3
2   7   primary   C4
3   4   primary   A1
3   4   secondary B3
3   4   tertiary  D2

When I run the following R code, d %>% group_by(id) %>% filter(row_number() == n()), it keeps only the last row within each ID. Any help would be appreciated!
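
For reproducibility, the data and the attempt above can be written out roughly as follows. This is a minimal sketch: the name d comes from the script in the question, while the column types (integer id and time, character dx and code) and the name last_only are assumptions made for illustration.

```r
library(dplyr)

# Sample data from the question; column types are assumed
d <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  time = c(1L, 5L, 5L, 1L, 7L, 4L, 4L, 4L),
  dx   = c("primary", "primary", "secondary", "primary", "primary",
           "primary", "secondary", "tertiary"),
  code = c("A1", "D2", "B3", "A2", "C4", "A1", "B3", "D2"),
  stringsAsFactors = FALSE
)

# The attempt: row_number() == n() is TRUE only for the final row
# of each group, so exactly one row per id survives
last_only <- d %>%
  group_by(id) %>%
  filter(row_number() == n()) %>%
  ungroup()
```

Here last_only has one row per id, which is why the tied rows for ID 1 and ID 3 are lost.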

离笑几人歌 2025-01-19 02:52:28

You can also add dx to the grouping and use slice_tail() to keep the last row of each id and dx combination:

dat %>% 
  group_by(id, dx) %>% 
  slice_tail(n = 1)

# A tibble: 6 x 4
# Groups:   id, dx [6]
     id  time dx        code 
  <int> <int> <chr>     <chr>
1     1     5 primary   D2   
2     1     5 secondary B3   
3     2     7 primary   C4   
4     3     4 primary   A1   
5     3     4 secondary B3   
6     3     4 tertiary  D2
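
Note that grouping by dx gives the desired result for this sample, but it appears to rely on each dx value occurring only once at the final time point of an id, and on no dx value occurring only at an earlier time. A sketch of an alternative that follows the stated rule more directly, keeping every row whose time equals the last time within its id, might look like this. It assumes the rows are already ordered by time within each id; the data frame is rebuilt so the example is self-contained, and the column types and the name res are assumptions.

```r
library(dplyr)

# Sample data from the question; column types are assumed
dat <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  time = c(1L, 5L, 5L, 1L, 7L, 4L, 4L, 4L),
  dx   = c("primary", "primary", "secondary", "primary", "primary",
           "primary", "secondary", "tertiary"),
  code = c("A1", "D2", "B3", "A2", "C4", "A1", "B3", "D2"),
  stringsAsFactors = FALSE
)

# Within each id, keep all rows whose time matches the time of the
# last row, so ties at the final time point are all retained
res <- dat %>%
  group_by(id) %>%
  filter(time == last(time)) %>%
  ungroup()
```

If the rows are not guaranteed to be sorted, time == max(time) inside filter() is one option that does not depend on row order.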
