R: reshape wide to long using strings

Posted on 2025-01-30 18:29:46

I have a table A like this:

cou	own	ind	aus_d_a	usa_f_b
AUS	D	A	3268.02	85.087
AUS	D	B	92.1329	10808.3

and I want to reshape it to long format as follows (table B):

cou	own	ind	cou2	own2	ind2	value
AUS	D	A	aus	d	a	3268.02
AUS	D	B	aus	d	a	92.1329
AUS	D	A	usa	f	b	85.087
AUS	D	B	usa	f	b	10808.3

but I don't know how to code this in R or Stata. Can anyone help? Thanks a lot.

PS: the data is just a sample. I actually have thousands of columns named along three dimensions (country_ownership_industry, e.g. aus_d_c21): 60 countries, 2 ownership types, and 34 industries, so 4080 columns in total.

决绝 2025-02-06 18:29:46

Provided your country_ownership_industry columns are the only ones with underscores in their column names, you can do:

library(tidyr)
library(dplyr)

pivot_longer(df, contains('_'), names_sep = '_', names_to = c('cou2', 'own2', 'ind2'))
#> # A tibble: 4 x 7
#>   cou   own   ind   cou2  own2  ind2    value
#>   <chr> <chr> <chr> <chr> <chr> <chr>   <dbl>
#> 1 AUS   D     A     aus   d     a      3268. 
#> 2 AUS   D     A     usa   f     b        85.1
#> 3 AUS   D     B     aus   d     a        92.1
#> 4 AUS   D     B     usa   f     b     10808.
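
If some id columns also have underscores in their names, `contains('_')` would pick them up too. A minimal sketch of one way around that, assuming every value column has exactly three lowercase, underscore-separated parts: select with a regular expression and reuse the same expression as `names_pattern`. The `src_note` column and the `value_pattern` name are hypothetical and only serve the illustration.

```r
library(tidyr)

# Sample data from the question, plus a hypothetical id column (src_note)
# whose name contains an underscore; it is not part of the original data.
df <- data.frame(
  cou = c("AUS", "AUS"),
  own = c("D", "D"),
  ind = c("A", "B"),
  src_note = c("x", "y"),
  aus_d_a = c(3268.02, 92.1329),
  usa_f_b = c(85.087, 10808.3)
)

# Assumed naming scheme: country_ownership_industry, three parts exactly.
value_pattern <- "^([a-z]+)_([a-z]+)_([a-z0-9]+)$"

long <- pivot_longer(
  df,
  cols = matches(value_pattern),        # only names with three parts
  names_to = c("cou2", "own2", "ind2"),
  names_pattern = value_pattern,        # one capture group per new column
  values_to = "value"
)

print(long)
```

Here `src_note` has only one underscore, so it does not match the pattern and stays as an id column.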

不寐倦长更 2025-02-06 18:29:46

Using a combination of pivot_longer and separate:

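# load the tidyverse first: tibble(), pivot_longer() and separate() all come from it
library(tidyverse)

# using your own example data
dat1 <- tibble(
  cou = c('AUS', 'AUS'), 
  own = c('D', 'D'), 
  ind = c('A', 'B'), 
  aus_d_a = c(3268.02, 92.1329), 
  usa_f_b = c(85.087, 10808.3)
)

# stack the value columns, then split each old column name into its three parts
dat1 %>%
  pivot_longer(cols = aus_d_a:usa_f_b, names_to = 'cou2', values_to = 'value') %>%
  separate(cou2, c('cou2', 'own2', 'ind2'), sep = '_')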
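
The range `aus_d_a:usa_f_b` depends on the value columns sitting next to each other in that order. With thousands of columns it may be simpler to pivot everything except the three id columns. The sketch below tries that on a small synthetic table; the country, ownership, and industry codes are made up for illustration and only mimic the naming scheme described in the question.

```r
library(tidyr)

# Hypothetical codes, used only to generate column names of the right shape.
countries  <- c("aus", "usa", "chn")
ownerships <- c("d", "f")
industries <- c("a", "b", "c21")

grid <- expand.grid(cou = countries, own = ownerships, ind = industries,
                    stringsAsFactors = FALSE)
value_cols <- paste(grid$cou, grid$own, grid$ind, sep = "_")

# Two rows of dummy values, one column per country_ownership_industry name.
vals <- as.data.frame(matrix(seq_len(2 * length(value_cols)), nrow = 2))
names(vals) <- value_cols

wide <- cbind(
  data.frame(cou = c("AUS", "AUS"), own = c("D", "D"), ind = c("A", "B")),
  vals
)

# Pivot every column except the ids, splitting names on the underscore.
long <- pivot_longer(
  wide,
  cols = -c(cou, own, ind),
  names_to = c("cou2", "own2", "ind2"),
  names_sep = "_",
  values_to = "value"
)

dim(long)
```

The same call should carry over to the full 4080-column table as long as `cou`, `own`, and `ind` are the only non-value columns.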
