How can I use dplyr's coalesce() with group_by() to create one row per person with all values filled in?

Posted on 2025-02-03 07:34:33

I am trying to use coalesce() to produce one row per participant that has their name and their score. Participants had 3 opportunities to fill in their data, and most only came in once (and those that came in multiple times always put in the same data). So my data looks like:

library(dplyr)

test_dataset <- tibble(name = c("justin", "justin", "justin", "corey", "corey", "corey", "sib", "sib", "sib", "kate", "kate", "kate"),
                       score1 = c(NA_real_, NA_real_, 1, 2, NA_real_, NA_real_, 2, NA_real_, 2, NA_real_, NA_real_ , NA_real_),
                       score2 = c(NA_real_, 7, NA_real_, 5, NA_real_, NA_real_, 9, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_))

And I want it to look like:

library(dplyr)

answer <- tibble(name = c("justin", "corey", "sib", "kate"),
                       score1_true = c(1, 2, 2, NA),
                       score2_true = c(7, 5, 9, NA))

I've tried the below solution, which does give me the 'true' score, but it's spread out over 12 rows (3 rows per person) instead of 4 (one per person):

library(dplyr)

test_dataset %>%
  dplyr::group_by(name) %>%
  mutate(across(c(starts_with("score")), .fns = list(true = ~coalesce(.))))

Comments (2)

蘑菇王子 2025-02-10 07:34:33

You can use fill(), and then arrange() the scores and use slice_head():

library(dplyr)
library(tidyr)  # fill() comes from tidyr

test_dataset %>% 
  group_by(name) %>%
  fill(score1, score2) %>%
  arrange(score1, score2) %>%
  slice_head(n=1)

Output:

  name   score1 score2
  <chr>   <dbl>  <dbl>
1 corey       2      5
2 justin      1      7
3 kate       NA     NA
4 sib         2      9

A more concise version, thanks to @M.Viking, uses the .direction="up" option of fill(), so the arrange() step is no longer needed:

test_dataset %>% 
  group_by(name) %>%
  fill(score1, score2, .direction="up") %>%
  slice_head(n=1)
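
The .direction="up" variant can be checked end to end against the sample data from the question. This is only a minimal sketch, assuming tidyr is installed alongside dplyr, since fill() is a tidyr function. The ungroup() and rename_with() steps are additions that are not in the answer above; they are there to return a plain tibble with the score1_true / score2_true names the question asked for.

```r
library(dplyr)
library(tidyr)  # fill() is exported by tidyr, not dplyr

# Same values as the question's test_dataset, written more compactly
test_dataset <- tibble(
  name   = rep(c("justin", "corey", "sib", "kate"), each = 3),
  score1 = c(NA, NA, 1, 2, NA, NA, 2, NA, 2, NA, NA, NA),
  score2 = c(NA, 7, NA, 5, NA, NA, 9, NA, NA, NA, NA, NA)
)

filled <- test_dataset %>%
  group_by(name) %>%
  # Within each person, copy later non-NA values up into earlier rows
  fill(score1, score2, .direction = "up") %>%
  # The first row of each group now holds the first non-NA value per column
  slice_head(n = 1) %>%
  ungroup() %>%
  # Added step (assumption): rename to match the question's desired output
  rename_with(~ paste0(.x, "_true"), starts_with("score"))

print(filled)
```

A person with no data at all (kate here) still gets one row, with NA in both score columns.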

饮湿 2025-02-10 07:34:33

We could reorder the values within each group so that the non-NA elements come first, and then slice the first row:

library(dplyr)
test_dataset %>%
  group_by(name) %>%
  mutate(across(starts_with("score"), ~ .x[order(is.na(.x))])) %>%
  slice_head(n = 1) %>%
  ungroup()

Output:

# A tibble: 4 × 3
  name   score1 score2
  <chr>   <dbl>  <dbl>
1 corey       2      5
2 justin      1      7
3 kate       NA     NA
4 sib         2      9
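
To see what the reordering step does on its own, it may help to run it on a single column for one person. The sketch below uses base R only; the vector x is an illustrative stand-in for justin's score2 values.

```r
# justin's score2 values from the question, in their original row order
x <- c(NA, 7, NA)

# is.na(x) is TRUE, FALSE, TRUE; FALSE sorts before TRUE,
# so order() lists the non-NA positions first and keeps ties in place
idx <- order(is.na(x))
reordered <- x[idx]

print(idx)        # 2 1 3
print(reordered)  # 7 NA NA
```

Because each score column is reordered independently, the first row of a group ends up holding the first non-NA value of every column, even when those values originally sat on different rows.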

Another option is to use complete.cases() after rearranging:

test_dataset %>%
  group_by(name) %>%
  mutate(across(starts_with("score"), ~ .x[order(is.na(.x))])) %>%
  filter(complete.cases(across(starts_with("score"))) | row_number() == 1) %>%
  ungroup()

Output:

# A tibble: 4 × 3
  name   score1 score2
  <chr>   <dbl>  <dbl>
1 justin      1      7
2 corey       2      5
3 sib         2      9
4 kate       NA     NA
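
Since the question asks about coalesce() specifically, here is a hedged sketch of how it could be used directly. coalesce() expects the candidate values as separate arguments, so coalesce(.) on a single column returns that column unchanged, and mutate() always keeps every row. Swapping in summarise() gives one row per group. The helper first_non_na() is a hypothetical name introduced for illustration; it is not part of dplyr.

```r
library(dplyr)

# Same values as the question's test_dataset, written more compactly
test_dataset <- tibble(
  name   = rep(c("justin", "corey", "sib", "kate"), each = 3),
  score1 = c(NA, NA, 1, 2, NA, NA, 2, NA, 2, NA, NA, NA),
  score2 = c(NA, 7, NA, 5, NA, NA, 9, NA, NA, NA, NA, NA)
)

# Hypothetical helper: pass each element of x to coalesce() as its own
# argument, so the result is the first non-NA element (or NA if none exists)
first_non_na <- function(x) do.call(coalesce, as.list(x))

collapsed <- test_dataset %>%
  group_by(name) %>%
  # summarise() returns one row per group because each summary has length 1
  summarise(across(starts_with("score"), first_non_na, .names = "{.col}_true"))

print(collapsed)
```

This relies on the question's statement that repeat visits always carry the same data. If a person could report different values on different visits, taking the first non-NA value would silently hide the conflict.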