pivot_longer into several pairs of columns

Posted on 2025-01-11 17:17:53

I need to pivot_longer across multiple groups of columns, creating multiple name-value pairs.

For instance, I need to go from something like this:

df_raw <- tribble(
  ~id, ~belief_dog, ~belief_bull_frog, ~belief_fish, ~age, ~norm_bull_frog, ~norm_fish, ~norm_dog, ~gender,
  "b2x8",    1,           4,          3,         41,     4,       2,          10,         2,
  "m89w",    3,           6,          2,         19,     1,       2,           3,         1,
  "32x8",    1,           5,          2,         38,     9,       1,           8,         3
)

And turn it into something like this:

df_final <- tribble(
  ~id,   ~belief_animal, ~belief_rating, ~norm_animal, ~norm_rating, ~age,   ~gender,
  "b2x8",    "dog",           1,          "bull_frog",      4,        41,       2,
  "b2x8",    "bull_frog",     4,          "fish",           2,        41,       2,
  "b2x8",    "fish",          3,          "dog",            10,       41,       2,
  "m89w",    "dog",           3,          "bull_frog",      1,        19,       1,
  "m89w",    "bull_frog",     6,          "fish",           2,        19,       1,
  "m89w",    "fish",          2,          "dog",            3,        19,       1,
  "32x8",    "dog",           1,          "bull_frog",      9,        38,       3,
  "32x8",    "bull_frog",     5,          "fish",           1,        38,       3,
  "32x8",    "fish",          2,          "dog",            8,        38,       3
)

In other words, every column starting with "belief_" should be pivoted into one name-value pair, and every column starting with "norm_" should be pivoted into another name-value pair.

I tried looking at several other Stack Overflow pages with somewhat related content but wasn't able to translate those solutions to this situation.

Any help would be appreciated, with a strong preference for dplyr solutions.

Thanks!

Comments (3)

尤怨 2025-01-18 17:17:53

With tidyverse, you can pivot on the two sets of columns that start with belief and norm. The regex in names_pattern splits each column name at the first underscore (some names, such as belief_bull_frog, contain more than one). The first group (belief or norm) is mapped to .value, so each prefix becomes its own value column; the second group (the animal name) goes into a single column named animal.

library(tidyverse)

df_raw %>%
  pivot_longer(cols = c(starts_with("belief"), starts_with("norm")),
               names_to = c('.value', 'animal'),
               names_pattern = '(.*?)_(.*)') %>% 
  rename(belief_rating = belief, norm_rating = norm)

Output

  id      age gender animal    belief_rating norm_rating
  <chr> <dbl>  <dbl> <chr>             <dbl>       <dbl>
1 b2x8     41      2 dog                   1          10
2 b2x8     41      2 bull_frog             4           4
3 b2x8     41      2 fish                  3           2
4 m89w     19      1 dog                   3           3
5 m89w     19      1 bull_frog             6           1
6 m89w     19      1 fish                  2           2
7 32x8     38      3 dog                   1           8
8 32x8     38      3 bull_frog             5           9
9 32x8     38      3 fish                  2           1
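
To see why the lazy quantifier in '(.*?)_(.*)' matters for names with more than one underscore, here is a small base R sketch. It does not call tidyr at all; it only applies the same pattern with sub() and perl = TRUE, which is an assumption about a PCRE-style engine rather than a statement about what pivot_longer does internally. The greedy variant '(.*)_(.*)' is added purely for comparison.

```r
# A column name from df_raw that contains two underscores.
x <- "belief_bull_frog"

# Lazy first group: it stops at the first underscore.
lazy <- c(sub("(.*?)_(.*)", "\\1", x, perl = TRUE),
          sub("(.*?)_(.*)", "\\2", x, perl = TRUE))

# Greedy first group (comparison only): it runs to the last underscore.
greedy <- c(sub("(.*)_(.*)", "\\1", x, perl = TRUE),
            sub("(.*)_(.*)", "\\2", x, perl = TRUE))

print(lazy)    # "belief" "bull_frog"
print(greedy)  # "belief_bull" "frog"
```

With the greedy version, the prefix would come out as belief_bull, which would no longer line up with the belief and norm value columns.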
谎言 2025-01-18 17:17:53

Solved it with a bit more experimentation!

The key comes down to both the names_to and the names_pattern arguments.

df_raw %>% pivot_longer(
  cols = c(belief_dog:belief_fish, norm_bull_frog:norm_dog),
  names_to = c(".value", "rating"),
  names_pattern = "([a-z]+)_*(.+)"
)

I don't really understand how ".value" or the regex "([a-z]+)_*(.+)" work, but the solution works nonetheless.
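
For anyone with the same question: in names_to, the special name ".value" means that the matching piece of each column name is used as the name of an output value column, while the other piece is stored in the column named next to it (here "rating", which ends up holding the animal names). The base R sketch below shows which pieces the regex captures. It uses regexec() with perl = TRUE as a stand-in for illustration and is not meant to reproduce tidyr's internals.

```r
# A few of the column names being pivoted.
cols <- c("belief_dog", "belief_bull_frog", "norm_fish")

# Group 1, [a-z]+, takes lowercase letters up to the first underscore.
# _* then consumes the underscore, and group 2, .+, takes the rest.
m <- regmatches(cols, regexec("([a-z]+)_*(.+)", cols, perl = TRUE))

# Keep only the two capture groups (element 1 is the full match).
parts <- t(sapply(m, function(x) x[2:3]))
colnames(parts) <- c(".value", "rating")
print(parts)
```

The first column of parts contains belief and norm, which is why the pivoted result has one column called belief and one called norm; the second column contains dog, bull_frog and fish.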

余生再见 2025-01-18 17:17:53

For these data:

library(dplyr)
library(tidyr)

df_raw %>% 
  pivot_longer(
    cols = -c(id, age, gender),
    names_to = "name1",
    values_to = "belief_rating"
  ) %>% 
  separate(name1, c("A", "B"), sep = '\\_' , extra = 'merge') %>%  
  group_by(id) %>% 
  mutate(helper = rep(row_number(), each=3, length.out = n())) %>% 
  pivot_wider(
    names_from = A,
    values_from = B,
    names_glue = "{A}_animal"
  ) %>% 
  mutate(norm_rating = ifelse(helper == 1, lead(belief_rating, 3), NA),
         norm_animal = ifelse(helper == 1, lead(norm_animal, 3), NA)) %>% 
  slice(1:3) %>% 
  select(id, belief_animal, belief_rating, norm_animal, norm_rating, age, gender)

Output

  id    belief_animal belief_rating norm_animal norm_rating   age gender
  <chr> <chr>                 <dbl> <chr>             <dbl> <dbl>  <dbl>
1 32x8  dog                       1 bull_frog             9    38      3
2 32x8  bull_frog                 5 fish                  1    38      3
3 32x8  fish                      2 dog                   8    38      3
4 b2x8  dog                       1 bull_frog             4    41      2
5 b2x8  bull_frog                 4 fish                  2    41      2
6 b2x8  fish                      3 dog                  10    41      2
7 m89w  dog                       3 bull_frog             1    19      1
8 m89w  bull_frog                 6 fish                  2    19      1
9 m89w  fish                      2 dog                   3    19      1