How do I merge multiple data frames from csv files when the ID column is implied?

Posted 2024-08-07 10:32:12

I'd like to merge a bunch of data frames together (because it seems many operations are easier if you're only dealing with one, but correct me if I'm wrong).

Currently I have one data frame like this:

ID, var1, var2
A,  2,    2
B,  4,    5
.
.
Z,  3,    2

Each ID is on a single row with several single measurements.

I also have a csv file for each ID with repeated measurements, like:

filename = ID_B.csv

time, var4, var5
0,    1,    2
1,    4,    5
2,    1,    6
...

What I'd like is:

ID, time, var1, var2, var4, var5
...
B,  0,    4,    5,    1,    2
B,  1,    4,    5,    4,    5
B,  2,    4,    5,    1,    6
...

I don't really care about the column order. The only solution I can think of is to add the ID column to each csv file then loop through them calling merge() several times. Is there a more elegant approach?

Comments (1)

灯下孤影 2024-08-14 10:32:12

My understanding is that you need to extract the ID from the filename, and then merge the imported csv with the existing dataframe.

df1 <- read.csv(textConnection("ID, var1, var2
A,  2,    2
B,  4,    5"))

# assuming the imported csv files are in the working directory
# (the dot is escaped so it only matches a literal ".")
filenames <- list.files(getwd(), pattern = "^ID_[A-Z]\\.csv$")

# extract the ID from each filename
ids <- sub("^ID_([A-Z])\\.csv$", "\\1", filenames)

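# import the csv files into one data frame; Var1 is used here as the
# index of the file each row came from, so it can look up the matching ID
library(plyr)
import <- mdply(filenames, read.csv)
import$ID <- ids[import$Var1]
import$Var1 <- NULL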

# merge the imported data with the existing data frame on the shared ID column
merge(df1, import, by = "ID")

Result:

  ID var1 var2 time var4 var5
1  B    4    5    0    1    2
2  B    4    5    1    4    5
3  B    4    5    2    1    6
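
For readers who would rather not depend on plyr, the same idea (take the ID from the file name, stack the files, merge once) can be sketched in base R. This is only an illustration under stated assumptions: the temporary directory, the sample file written into it, and the helper name read_with_id are made up here so the snippet runs on its own; they are not part of the original question or answer.

```r
# Sketch only: a base R variant of the approach above.
# The temp directory and sample file are hypothetical, created for illustration.
df1 <- read.csv(text = "ID,var1,var2
A,2,2
B,4,5")

csv_dir <- tempfile("repeated_measures_")
dir.create(csv_dir)
writeLines(c("time,var4,var5", "0,1,2", "1,4,5", "2,1,6"),
           file.path(csv_dir, "ID_B.csv"))

# Find the per-ID files; the dot is escaped so it matches a literal "."
paths <- list.files(csv_dir, pattern = "^ID_[A-Z]\\.csv$", full.names = TRUE)

# Read one file and tag its rows with the ID taken from the file name
# (read_with_id is a hypothetical helper name)
read_with_id <- function(path) {
  d <- read.csv(path)
  d$ID <- sub("^ID_([A-Z])\\.csv$", "\\1", basename(path))
  d
}

# Stack all files into one long data frame, then merge once on ID
import <- do.call(rbind, lapply(paths, read_with_id))
merged <- merge(df1, import, by = "ID")
print(merged)
```

Note that merge() by default keeps only IDs present in both data frames, so an ID without a csv file (A in this sketch) is dropped. Passing all.x = TRUE keeps such rows, with NA in the time, var4 and var5 columns.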