Extracting data from one data frame into another data frame with a different number of rows

Posted on 2024-12-03 18:46:27

I have two data.frames as follows:

df1 <- data.frame(A=c("lee","eeu","ees"), B=c("lee","ggu","1su"), C=c(1,1,1))

    A   B C
1 lee lee 1
2 eeu ggu 1
3 ees 1su 1


df2 <- data.frame(X=c("lee","1su","eeu","ggu"), Y=c("3k3","4k4","5k","2ee"), Z=c("ggg","","","ooo"), ZA=c("vvv","","",""))

    X   Y   Z  ZA
1 lee 3k3 ggg vvv
2 1su 4k4        
3 eeu  5k        
4 ggu 2ee ooo    

I want to expand df1 by matching df1$B against df2$X. Wherever df1$B equals df2$X, new_df1 should get one row for each non-empty entry in the other columns of that df2 row: the new B is that entry, while A and C are carried over unchanged.

new_df1 is expected to be as follows:

 A   B  C
lee 3k3 1 ### df1$B[1] == df2$X[1] == "lee"
lee ggg 1
lee vvv 1
eeu 2ee 1 ### df1$B[2] == df2$X[4] == "ggu"
eeu ooo 1
ees 4k4 1 ### df1$B[3] == df2$X[2] == "1su"

In my experience lapply is very memory-demanding. Can this be done without lapply?

Comments (4)

就此别过 2024-12-10 18:46:27

I think what you want is a subset of this:

library(reshape2)
merge(df1, melt(df2, id.vars="X"), by.x="B", by.y="X", all=TRUE)
     B    A  C variable value
1  1su  ees  1        Y   4k4
2  1su  ees  1        Z      
3  1su  ees  1       ZA      
4  ggu  eeu  1        Y   2ee
5  ggu  eeu  1        Z   ooo
6  ggu  eeu  1       ZA      
7  lee  lee  1        Y   3k3
8  lee  lee  1        Z   ggg
9  lee  lee  1       ZA   vvv
10 eeu <NA> NA        Y    5k
11 eeu <NA> NA        Z      
12 eeu <NA> NA       ZA      

I assigned the merged object to M1 (and noticed afterwards that all=TRUE is not needed):

M1 <- merge(df1, melt(df2, id.vars="X"), by.x="B", by.y="X")
subset(M1, value != "", select=c(A, value, C))
    A value C
1 ees   4k4 1
4 eeu   2ee 1
5 eeu   ooo 1
7 lee   3k3 1
8 lee   ggg 1
9 lee   vvv 1
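
The result above is sorted by the merge key and keeps the column name value. As a minimal sketch, assuming base R only (the hand-built long table stands in for melt(), and the helper columns ord and col are hypothetical names, not part of the answer), the same merge idea can be taken one step further to restore df1's row order and the column names A, B, C:

```r
df1 <- data.frame(A = c("lee", "eeu", "ees"),
                  B = c("lee", "ggu", "1su"),
                  C = c(1, 1, 1),
                  stringsAsFactors = FALSE)
df2 <- data.frame(X  = c("lee", "1su", "eeu", "ggu"),
                  Y  = c("3k3", "4k4", "5k", "2ee"),
                  Z  = c("ggg", "", "", "ooo"),
                  ZA = c("vvv", "", "", ""),
                  stringsAsFactors = FALSE)

# Base-R stand-in for melt(df2, id.vars = "X"): stack Y, Z, ZA into one column.
# "col" (hypothetical helper) remembers which df2 column a value came from.
value_cols <- setdiff(names(df2), "X")
long <- data.frame(
  X     = rep(df2$X, times = length(value_cols)),
  col   = rep(seq_along(value_cols), each = nrow(df2)),
  value = unlist(df2[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# merge() sorts by the key, so remember df1's original row order first.
df1$ord <- seq_len(nrow(df1))
m <- merge(df1, long, by.x = "B", by.y = "X")

# Drop empty entries, then order by df1 row and by df2 column.
m <- m[m$value != "", ]
m <- m[order(m$ord, m$col), ]

new_df1 <- data.frame(A = m$A, B = m$value, C = m$C,
                      stringsAsFactors = FALSE)
print(new_df1)
```

Because merge() does an inner join by default, the eeu row of df2, which has no partner in df1$B, drops out without any extra filtering.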
红焚 2024-12-10 18:46:27

I would use melt() from the reshape package for this task.

 melt(df2, c("X"))
     X variable value
1  lee        Y   3k3
2  1su        Y   4k4
3  eeu        Y    5k
4  ggu        Y   2ee
5  lee        Z   ggg
6  1su        Z      
7  eeu        Z      
8  ggu        Z   ooo
9  lee       ZA   vvv
10 1su       ZA      
11 eeu       ZA      
12 ggu       ZA      

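# melt to long form, then drop the column that records the source column name
x <- melt(df2, id.vars = "X")
x$variable <- NULL
# C is 1 for every row in this example
x$C <- 1
# rename X to A and value to B so the columns line up with df1
colnames(x) <- c("A","B","C")

Now subset and rbind():

# drop the empty entries, then append the remaining rows to df1
x <- subset(x, B != "")
newdf <- rbind(df1, x)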
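
The steps above set C to a constant and label the rows by df2$X, which only lines up with df1 when C never varies and A equals the key. As a hedged sketch of a variation (not the answerer's code), the long table can instead look up A and C in df1 with match(). It assumes base R's reshape() as a stand-in for melt(), and the names long, hit and found are illustrative only:

```r
df1 <- data.frame(A = c("lee", "eeu", "ees"),
                  B = c("lee", "ggu", "1su"),
                  C = c(1, 1, 1),
                  stringsAsFactors = FALSE)
df2 <- data.frame(X  = c("lee", "1su", "eeu", "ggu"),
                  Y  = c("3k3", "4k4", "5k", "2ee"),
                  Z  = c("ggg", "", "", "ooo"),
                  ZA = c("vvv", "", "", ""),
                  stringsAsFactors = FALSE)

# Long form of df2: one row per (X, source column) pair, values in column B.
value_cols <- setdiff(names(df2), "X")
long <- reshape(df2,
                direction = "long",
                idvar     = "X",
                varying   = value_cols,
                v.names   = "B",
                timevar   = "variable",
                times     = value_cols)

# Keep only non-empty entries.
long <- long[long$B != "", ]

# For each long row, find the df1 row whose B equals the key X.
# match() returns only the first hit, so this assumes df1$B has no duplicates.
hit   <- match(long$X, df1$B)
found <- !is.na(hit)

# Take A and C from df1 rather than hard-coding them.
new_df1 <- data.frame(A = df1$A[hit[found]],
                      B = long$B[found],
                      C = df1$C[hit[found]],
                      stringsAsFactors = FALSE)
print(new_df1)
```

The rows come out grouped by source column (all Y values first, then Z, then ZA) rather than by df1 row, so an extra order(hit[found]) would be needed if the layout in the question matters.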
递刀给你 2024-12-10 18:46:27

There is a much easier way to do this: use the match function.

df1$Y <- df2$Y[match(df1$B, df2$X)]

You can extend it to the other columns too.
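
One way to read "extend it to the other columns" is sketched below, under the assumption that the goal is the long new_df1 from the question rather than extra columns on df1. It uses only base R and no lapply; the names idx, keep and vals are illustrative:

```r
df1 <- data.frame(A = c("lee", "eeu", "ees"),
                  B = c("lee", "ggu", "1su"),
                  C = c(1, 1, 1),
                  stringsAsFactors = FALSE)
df2 <- data.frame(X  = c("lee", "1su", "eeu", "ggu"),
                  Y  = c("3k3", "4k4", "5k", "2ee"),
                  Z  = c("ggg", "", "", "ooo"),
                  ZA = c("vvv", "", "", ""),
                  stringsAsFactors = FALSE)

# Row of df2 that matches each row of df1 (NA where there is no match).
idx  <- match(df1$B, df2$X)
keep <- !is.na(idx)

# Character matrix: one row per matched df1 row, one column per df2 value column.
value_cols <- setdiff(names(df2), "X")
vals <- as.matrix(df2[idx[keep], value_cols, drop = FALSE])

# Repeat each df1 row once per value column and flatten the matrix row by row.
n_col <- ncol(vals)
new_df1 <- data.frame(
  A = rep(df1$A[keep], each = n_col),
  B = as.vector(t(vals)),
  C = rep(df1$C[keep], each = n_col),
  stringsAsFactors = FALSE
)

# Drop the empty entries and renumber the rows.
new_df1 <- new_df1[new_df1$B != "", ]
rownames(new_df1) <- NULL
print(new_df1)
```

Everything here is vectorised indexing, so the intermediate objects are no larger than the matched part of df2. As with the one-column version, match() only finds the first matching row of df2 for each key.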

深爱不及久伴 2024-12-10 18:46:27
# example data.frames
d <- data.frame(a = 1:10, b = 1:10)
e <- data.frame(a = 5:1, b = 5:1)

# add the row number of the reference data.frame
d$row <- seq_len(nrow(d))

# merge the data.frames by the desired columns
m <- merge(d, e, by = c("a", "b"))

# check results: which rows of d found a match in e
m$row
d$row %in% m$row