In R, create a new column based on the order of a second column grouped by a third column

Posted on 2025-01-10 17:45:12

This is very similar to some other questions, but I wasn't quite satisfied with the other answers.

I have data where one column is the outcome of a Latin Square study design, where a participant had three conditions that could have come in six possible orders. I do not have a variable that indicates the order that the participant actually received the study conditions, and so need to create one myself. Here is my current and desired output using a fake example from the first three participants:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
(current <- tibble(
    participant = c(1,1,1,2,2,2,3,3,3),
    block_code = c("timed", "untimed", "practice", "untimed", "practice", "timed", "timed", "untimed", "practice")
    ))
#> # A tibble: 9 × 2
#>   participant block_code
#>         <dbl> <chr>     
#> 1           1 timed     
#> 2           1 untimed   
#> 3           1 practice  
#> 4           2 untimed   
#> 5           2 practice  
#> 6           2 timed     
#> 7           3 timed     
#> 8           3 untimed   
#> 9           3 practice
(desired <- current %>%
    mutate(order_code = c(rep("tup", 3), rep("upt", 3), rep("tup", 3))))
#> # A tibble: 9 × 3
#>   participant block_code order_code
#>         <dbl> <chr>      <chr>     
#> 1           1 timed      tup       
#> 2           1 untimed    tup       
#> 3           1 practice   tup       
#> 4           2 untimed    upt       
#> 5           2 practice   upt       
#> 6           2 timed      upt       
#> 7           3 timed      tup       
#> 8           3 untimed    tup       
#> 9           3 practice   tup

Created on 2022-02-28 by the reprex package (v2.0.1)

Participants 1 and 3 had the same order, so they ended up with the same code.

How can I tell R to create a new column based on the order of the block_code variable within a participant?

思念绕指尖 2025-01-17 17:45:12

Another slightly different option is to use summarise so that you can drop the grouping without having to ungroup. Here, we group by the participant, then collapse together only the first letter for each group.

library(tidyverse)

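current %>%
  group_by(participant) %>%
  summarise(
    block_code,  # keep one row per original observation
    # first letter of each block, pasted together in row order within the participant
    order_code = paste(substr(block_code, 1, 1), collapse = ""),
    .groups = "drop"  # return an ungrouped tibble
  )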

Output

  participant block_code order_code
        <dbl> <chr>      <chr>     
1           1 timed      tup       
2           1 untimed    tup       
3           1 practice   tup       
4           2 untimed    upt       
5           2 practice   upt       
6           2 timed      upt       
7           3 timed      tup       
8           3 untimed    tup       
9           3 practice   tup
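On dplyr 1.1.0 or later, summarise() warns when a group returns more than one row, and reframe() is the suggested replacement. Here is a minimal sketch of the same idea with reframe(), assuming dplyr >= 1.1.0 is installed. It rebuilds the question's `current` so that it runs on its own, and the name `result` is only for illustration.

```r
library(dplyr)

# Same example data as in the question
current <- tibble(
  participant = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  block_code = c("timed", "untimed", "practice",
                 "untimed", "practice", "timed",
                 "timed", "untimed", "practice")
)

# reframe() may return any number of rows per group and always
# returns an ungrouped data frame, so no ungroup() is needed.
result <- current %>%
  group_by(participant) %>%
  reframe(
    block_code,
    order_code = paste(substr(block_code, 1, 1), collapse = "")
  )

result
```

The single order_code string is recycled across the three rows of each participant, so the shape of the result should match the summarise() version.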

Or with data.table:

library("data.table")
dt <- as.data.table(current)

# := adds order_code to dt by reference, computed separately for each participant
dt[, order_code := paste(substr(block_code, 1, 1), collapse = ""), by = participant]

Or with base R:

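# aggregate() builds one code per participant; merge() joins it back onto every row.
# The \(x) lambda shorthand needs R 4.1 or later.
merge(current, setNames(
  aggregate(
    block_code ~ participant,
    data = current,
    FUN = \(x) paste(substr(x, 1, 1), collapse = "")
  ),
  c("participant", "order_code")
), by = "participant")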
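One thing to keep in mind with merge() is that it sorts the result by the key by default, so the row order can change. If keeping the original row order matters, a base R sketch with ave() may be simpler. This is a swapped-in alternative rather than part of the answer above, and it rebuilds the example data as a plain data.frame so that it runs on its own.

```r
# Same example data as in the question, as a base data.frame
current <- data.frame(
  participant = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  block_code = c("timed", "untimed", "practice",
                 "untimed", "practice", "timed",
                 "timed", "untimed", "practice"),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each participant and writes the result back
# in the original row positions; the single string is recycled per group.
current$order_code <- ave(
  current$block_code,
  current$participant,
  FUN = function(x) paste(substr(x, 1, 1), collapse = "")
)

current
```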

风吹雪碎 2025-01-17 17:45:12

You can group_by(participant), then create order_code by collapsing the first letter of each block_code:

library(tidyverse)

(current %>% 
  group_by(participant) %>% 
  mutate(order_code = str_c(str_sub(block_code, end = 1), collapse = "")) %>% 
  ungroup())
#> # A tibble: 9 x 3
#>   participant block_code order_code
#>         <dbl> <chr>      <chr>     
#> 1           1 timed      tup       
#> 2           1 untimed    tup       
#> 3           1 practice   tup       
#> 4           2 untimed    upt       
#> 5           2 practice   upt       
#> 6           2 timed      upt       
#> 7           3 timed      tup       
#> 8           3 untimed    tup       
#> 9           3 practice   tup

Created on 2022-02-28 by the reprex package (v2.0.1)
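Building the code from first letters relies on the three conditions starting with different letters, which holds for timed, untimed and practice. As a rough sanity check, the resulting codes can be compared against the six possible orders mentioned in the question. The helper name `check_order_codes` below is hypothetical and only serves as an illustration.

```r
# Hypothetical helper: TRUE when every code is one of the six
# possible orderings of the letters t, u and p.
check_order_codes <- function(codes) {
  valid <- c("tup", "tpu", "utp", "upt", "ptu", "put")
  all(codes %in% valid)
}

# One code per participant from the example above
codes <- c("tup", "upt", "tup")
ok <- check_order_codes(codes)

# A participant with a missing block would produce a shorter code
incomplete <- check_order_codes(c("tup", "tu"))
```

A FALSE result would suggest that some participant has a missing or repeated block, which is worth inspecting before analysing order effects.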
