How do I export a CSV file for each element of a list and name each file from matching values in another dataset?

Posted 2025-01-10 06:25:29

I have a codelist with hundreds of Diseases, and multiple codes for each Disease. I need to save one individual codelist for each Disease and name it after a specific codelist name.

I split the codelist into a list of data frames, one per Disease, using the commands below:

Disease <- as.character(c("HIV","HIV", "HIV", "HIV", "anaemia", "anaemia", "anaemia", "Chronic Kidney Disease", "Chronic Kidney Disease"))
Code <- c(123, 432, 567, 876, 433, 096, 543, 912, 456)
codelist <- data.frame(Disease, Code)
codelist
                 Disease Code
1                    HIV  123
2                    HIV  432
3                    HIV  567
4                    HIV  876
5                anaemia  433
6                anaemia   96
7                anaemia  543
8 Chronic Kidney Disease  912
9 Chronic Kidney Disease  456

library(dplyr)  # provides %>% and group_split()

list <- codelist %>%
  dplyr::group_split(Disease)

This gives me one data frame per Disease:

> list
<list_of<
  tbl_df<
    Disease: character
    Code   : double
  >
>[3]>
[[1]]
# A tibble: 3 × 2
  Disease  Code
  <chr>   <dbl>
1 anaemia   433
2 anaemia    96
3 anaemia   543

[[2]]
# A tibble: 2 × 2
  Disease                 Code
  <chr>                  <dbl>
1 Chronic Kidney Disease   912
2 Chronic Kidney Disease   456

[[3]]
# A tibble: 4 × 2
  Disease  Code
  <chr>   <dbl>
1 HIV       123
2 HIV       432
3 HIV       567
4 HIV       876

I also have a data frame with the file name for each Disease's codelist:

Disease <- as.character(c("anaemia", "Chronic Kidney Disease", "HIV"))
File_name <- c("ICD_anemia_2010", "ICD_CKD_2022", "ICD_HIV_2010")
Codelists_names <- data.frame(Disease, File_name)
Codelists_names
                 Disease       File_name
1                anaemia ICD_anemia_2010
2 Chronic Kidney Disease    ICD_CKD_2022
3                    HIV    ICD_HIV_2010

I would like to export an individual CSV file for each Disease, naming each file after the File_name value in Codelists_names that matches that Disease.

How could I do that, please? Many thanks.

魄砕の薆 2025-01-17 06:25:29

There are basically two steps here:

1. Merge in your filenames using dplyr::left_join(), then base::split() by File_name. This gives you a named list of data frames, where each name is the filename to save to.
2. Use purrr::iwalk() to iterate over the list, saving each data frame to the path specified in its name.

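library(tidyverse)

codelist %>% 
  # attach the matching File_name to every row
  left_join(Codelists_names, by = "Disease") %>% 
  # named list of data frames, one element per File_name
  split(.$File_name) %>% 
  # drop the helper column before writing
  map(~ select(.x, !File_name)) %>% 
  # .x is the data frame, .y is its name in the list
  iwalk(~ write_csv(.x, str_c(.y, ".csv")))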

# check results (pattern is a regular expression, so anchor it to the extension)
list.files(pattern = "\\.csv$")
# "ICD_anemia_2010.csv" "ICD_CKD_2022.csv"    "ICD_HIV_2010.csv"

read_csv("ICD_HIV_2010.csv")
# # A tibble: 4 x 2
#   Disease  Code
#   <chr>   <dbl>
# 1 HIV       123
# 2 HIV       432
# 3 HIV       567
# 4 HIV       876
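
One caveat with the approach above: split() drops rows whose grouping value is NA, so a Disease that has no match in Codelists_names would silently produce no file. Below is a minimal base R sketch of the same two steps with an explicit check for that case. The output folder out_dir and the error message are my own assumptions for illustration, not something from the question; the data is copied from the question.

```r
# Example data, copied from the question.
codelist <- data.frame(
  Disease = c("HIV", "HIV", "HIV", "HIV", "anaemia", "anaemia", "anaemia",
              "Chronic Kidney Disease", "Chronic Kidney Disease"),
  Code = c(123, 432, 567, 876, 433, 96, 543, 912, 456)
)
Codelists_names <- data.frame(
  Disease = c("anaemia", "Chronic Kidney Disease", "HIV"),
  File_name = c("ICD_anemia_2010", "ICD_CKD_2022", "ICD_HIV_2010")
)

# Hypothetical output folder; a temporary directory keeps the sketch
# from writing into the working directory.
out_dir <- file.path(tempdir(), "codelists")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Left join in base R: all.x = TRUE keeps diseases that have no file name.
joined <- merge(codelist, Codelists_names, by = "Disease", all.x = TRUE)

# split() drops rows whose grouping value is NA, so stop early instead.
unmatched <- unique(joined$Disease[is.na(joined$File_name)])
if (length(unmatched) > 0) {
  stop("No file name for: ", paste(unmatched, collapse = ", "))
}

# Named list: one data frame per file name, without the File_name column.
by_file <- split(joined[c("Disease", "Code")], joined$File_name)

# Write each element to "<File_name>.csv" inside out_dir.
for (file_name in names(by_file)) {
  write.csv(by_file[[file_name]],
            file.path(out_dir, paste0(file_name, ".csv")),
            row.names = FALSE)
}

# Names of the files that were written.
written <- list.files(out_dir, pattern = "\\.csv$")
```

To write into a real folder, point out_dir at that path. Note that merge() sorts rows by the join key, so the row order within the combined data can differ from the original codelist.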
