How to use the tidyverse to join one data frame onto another without changing the values
I am working with multiple data.frames
that I have to stack on top of each other, which is roughly what I expected from the tidyverse function full_join.
To illustrate, I have the following datasets:
name<-c("AAA","AAA","AAA")
value<-c(1:3)
tag<-c(0,0,0)
part_a<-data.frame(name,value,tag)
name<-c("AAA","AAA","AAA")
value<-c(1:3)
key<-c(1,1,1)
part_b<-data.frame(name,value,key)
My desired output would be something like this:
name | value | tag | key |
---|---|---|---|
AAA | 1 | 0 | NA |
AAA | 2 | 0 | NA |
AAA | 3 | 0 | NA |
AAA | 1 | NA | 1 |
AAA | 2 | NA | 1 |
AAA | 3 | NA | 1 |
but instead I am getting this:
> full_join(part_a,part_b)
Joining, by = c("name", "value")
name value tag key
1 AAA 1 0 1
2 AAA 2 0 1
3 AAA 3 0 1
This confuses me: full_join seems to match rows on their common values and then merge the remaining columns, but what I really want is to stack all the data frames on top of each other, including the columns they do not have in common. I know I cannot use rbind, since it requires the data frames to have the same column names. I would be very thankful if you could help me out!
full_join() is merging your data frames. Since you didn't specify which columns to use as identifiers, it matches on the common fields (i.e., name and value). To simply stack the data frames, you can use dplyr's bind_rows():
result <- bind_rows(part_a, part_b)
Note that there is also a bind_cols() for combining variables (i.e., columns) from multiple data frames into a single one.
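For completeness, here is a minimal, self-contained sketch of the bind_rows() approach using the data from the question. It assumes dplyr is installed; the `source` column and the `a`/`b` labels are only illustrative names, not part of the original question.

```r
library(dplyr)

# The two data frames from the question
part_a <- data.frame(name = c("AAA", "AAA", "AAA"), value = 1:3, tag = c(0, 0, 0))
part_b <- data.frame(name = c("AAA", "AAA", "AAA"), value = 1:3, key = c(1, 1, 1))

# Stack the rows; a column missing from one input is filled with NA
result <- bind_rows(part_a, part_b)
print(result)

# Optional: .id adds a column recording which input each row came from
# ("source", "a" and "b" are hypothetical labels chosen for this example)
tagged <- bind_rows(a = part_a, b = part_b, .id = "source")
print(tagged)
```

The first result should have six rows, with key set to NA for the rows from part_a and tag set to NA for the rows from part_b, which matches the desired output in the question. The .id variant can be handy when you later need to tell the stacked pieces apart.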