How to use pivot_longer() in R to separate columns into multiple rows by category?

Posted on 2025-01-31 09:21:49 · 2988 characters · 3 views · 0 comments

Here is some fictional data:

library(tibble)

# Simulated stock counts: one row per fruit/size, one column per shop_season_week
df <- tibble(fruit = rep(c("apple", "pear", "orange"), each = 3),
             size = rep(c("big", "medium", "small"), times = 3),
             # summer stock
             shopA_summer_wk1 = abs(round(rnorm(9, 10, 5), 0)),
             shopA_summer_wk2 = abs(round(rnorm(9, 10, 5), 0)),
             shopB_summer_wk1 = abs(round(rnorm(9, 10, 5), 0)),
             shopB_summer_wk2 = abs(round(rnorm(9, 10, 5), 0)),
             shopC_summer_wk1 = abs(round(rnorm(9, 10, 5), 0)),
             shopC_summer_wk2 = abs(round(rnorm(9, 10, 5), 0)),
             # winter stock
             shopA_winter_wk1 = abs(round(rnorm(9, 8, 4), 0)),
             shopA_winter_wk2 = abs(round(rnorm(9, 8, 4), 0)),
             shopA_winter_wk3 = abs(round(rnorm(9, 8, 4), 0)),
             shopB_winter_wk1 = abs(round(rnorm(9, 8, 4), 0)),
             shopB_winter_wk2 = abs(round(rnorm(9, 8, 4), 0)),
             shopB_winter_wk3 = abs(round(rnorm(9, 8, 4), 0)),
             shopC_winter_wk1 = abs(round(rnorm(9, 8, 4), 0)),
             shopC_winter_wk2 = abs(round(rnorm(9, 8, 4), 0)),
             shopC_winter_wk3 = abs(round(rnorm(9, 8, 4), 0)))

Some data is collected for 3 shops (A, B, C) across 2 weeks in the summer and 3 weeks in the winter. The data collected is the number of fruits (apple, pear, orange) per size (big, medium, small) the shop had in stock on that particular week.

Here are the first 6 rows of the dataset (only the first 11 columns are shown):

# fruit  size   shopA_summer_wk1   shopA_summer_wk2 shopB_summer_wk1 shopB_summer_wk2 shopC_summer_wk1 shopC_summer_wk2 shopA_winter_wk1 shopA_winter_wk2 shopA_winter_wk3
#   <chr>  <chr>             <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>
# 1 apple  big                   9               12               12               16               15                5               14                4                0
# 2 apple  medium               21               16               16                1               12               11                8                8                9
# 3 apple  small                10                6               18               18               22               12                4                2                0
# 4 pear   big                  13                7                4               12               13                6               10                6                2
# 5 pear   medium               13               12                8                0                8                5               11                7                3
# 6 pear   small                16               18                4                3               13                8                7                5                0

I would like to use the pivot_longer() function in R to restructure this dataset. Because the column names encode several grouping variables (shop, season, and week), I'm having difficulty writing the code for this.

I would like it to look something like the following:

I would greatly appreciate any input :)

哭了丶谁疼 2025-02-07 09:21:49

Using the names_pattern argument, we can do:

pivot_longer(df, c(-fruit, -size), names_pattern = '(^.*)_wk(.*$)',
              names_to = c('Shop_season', 'week'))
#> # A tibble: 135 x 5
#>    fruit size  Shop_season  week  value
#>    <chr> <chr> <chr>        <chr> <dbl>
#>  1 apple big   shopA_summer 1        11
#>  2 apple big   shopA_summer 2         8
#>  3 apple big   shopB_summer 1         4
#>  4 apple big   shopB_summer 2        24
#>  5 apple big   shopC_summer 1         9
#>  6 apple big   shopC_summer 2        10
#>  7 apple big   shopA_winter 1         9
#>  8 apple big   shopA_winter 2        12
#>  9 apple big   shopA_winter 3         5
#> 10 apple big   shopB_winter 1         5
#> # ... with 125 more rows

You might also want to separate shop and season, since these are really two different variables:

pivot_longer(df, c(-fruit, -size), names_pattern = '(^.*)_wk(.*$)',
              names_to = c('Shop_season', 'week')) %>%
   separate(Shop_season, into = c('shop', 'season'))
#> # A tibble: 135 x 6
#>    fruit size  shop  season week  value
#>    <chr> <chr> <chr> <chr>  <chr> <dbl>
#>  1 apple big   shopA summer 1        11
#>  2 apple big   shopA summer 2         8
#>  3 apple big   shopB summer 1         4
#>  4 apple big   shopB summer 2        24
#>  5 apple big   shopC summer 1         9
#>  6 apple big   shopC summer 2        10
#>  7 apple big   shopA winter 1         9
#>  8 apple big   shopA winter 2        12
#>  9 apple big   shopA winter 3         5
#> 10 apple big   shopB winter 1         5
#> #... with 125 more rows
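
As a possible variation on the two steps above, the split into shop, season, and week can probably be done inside a single pivot_longer() call by giving names_pattern three capture groups. The sketch below is not the answer's code: it uses a reduced stand-in for df with only four value columns, the column name "stock" is an assumption chosen for illustration, and names_transform requires tidyr 1.1.0 or later.

```r
library(tibble)
library(tidyr)

# Reduced stand-in for the question's data (same column naming scheme,
# fewer columns; this is an assumption made to keep the example short)
df <- tibble(
  fruit = rep(c("apple", "pear", "orange"), each = 3),
  size  = rep(c("big", "medium", "small"), times = 3),
  shopA_summer_wk1 = abs(round(rnorm(9, 10, 5))),
  shopA_summer_wk2 = abs(round(rnorm(9, 10, 5))),
  shopB_winter_wk1 = abs(round(rnorm(9, 8, 4))),
  shopB_winter_wk3 = abs(round(rnorm(9, 8, 4)))
)

long <- pivot_longer(
  df,
  cols = -c(fruit, size),
  # one output column per capture group in names_pattern
  names_to = c("shop", "season", "week"),
  names_pattern = "^(.*)_(.*)_wk(\\d+)$",
  # convert the captured week digits from character to integer
  names_transform = list(week = as.integer),
  # "stock" is a hypothetical name for the value column
  values_to = "stock"
)

long
```

With this approach week comes out as an integer column rather than character, which may be more convenient for sorting or plotting by week.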

独行侠 2025-02-07 09:21:49

If the data is stored in dt, then:

pivot_longer(
  data = dt,
  cols = -c(fruit:size),
  names_to = c("shop_season", "week"),
  names_pattern = "(.*)_(.*)"
)

Output:

# A tibble: 135 x 5
   fruit size  shop_season  week  value
   <chr> <chr> <chr>        <chr> <dbl>
 1 apple big   shopA_summer wk1      13
 2 apple big   shopA_summer wk2      12
 3 apple big   shopB_summer wk1       9
 4 apple big   shopB_summer wk2       9
 5 apple big   shopC_summer wk1       7
 6 apple big   shopC_summer wk2      17
 7 apple big   shopA_winter wk1      10
 8 apple big   shopA_winter wk2      17
 9 apple big   shopA_winter wk3      12
10 apple big   shopB_winter wk1       8
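
Because every column name here follows the same shop_season_week layout with exactly two underscores, a regular expression may not be needed at all. A minimal sketch, assuming that naming scheme holds for every value column, uses names_sep instead of names_pattern. The data below is a reduced stand-in for dt, not the full dataset, and the final step that strips the "wk" prefix is an optional addition.

```r
library(tibble)
library(tidyr)

# Reduced stand-in for dt (an assumption; the real data has 15 value columns)
dt <- tibble(
  fruit = rep(c("apple", "pear", "orange"), each = 3),
  size  = rep(c("big", "medium", "small"), times = 3),
  shopA_summer_wk1 = abs(round(rnorm(9, 10, 5))),
  shopB_summer_wk2 = abs(round(rnorm(9, 10, 5))),
  shopC_winter_wk3 = abs(round(rnorm(9, 8, 4)))
)

long <- pivot_longer(
  data = dt,
  cols = -c(fruit, size),
  # split each column name at every underscore into three pieces
  names_to = c("shop", "season", "week"),
  names_sep = "_"
)

# Optional: turn "wk1", "wk2", ... into the integers 1, 2, ...
long$week <- as.integer(sub("^wk", "", long$week))

long
```

The result has one row per original cell, so the number of rows equals the number of rows in dt times the number of value columns (9 × 15 = 135 for the full dataset in the question).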
