Add new variables to one dataset based on values from another dataset in R
I have a dataset "df" with many observations and multiple variables, including some postal codes (repeated several times in some cases), and a different dataset "df2" with the coordinates of these postal codes. I want to add two new variables to my first dataset "df" holding the coordinates of these postal codes but, given the huge amount of data I have, a loop takes too long. I would like to know whether I can vectorize this in some way while keeping the data frame structure and not converting it to a matrix. I attach a simplified version of what I want to achieve.
# This dataset has my variables (removed the rest for simplicity)
df <- data.frame(pc = c("00001", "00002", "00003",
                        "00001", "00002", "00003",
                        "00001", "00002", "00003"))
#>      pc
#> 1 00001
#> 2 00002
#> 3 00003
#> 4 00001
#> 5 00002
#> 6 00003
#> 7 00001
#> 8 00002
#> 9 00003

# This dataset holds the coordinates
df2 <- data.frame(pc = c("00001", "00002", "00003"),
                  lat = c(1, 2, 3),
                  long = c(4, 5, 6))
#>      pc lat long
#> 1 00001   1    4
#> 2 00002   2    5
#> 3 00003   3    6

# This is the dataset I need
good.df <- data.frame(pc = c("00001", "00002", "00003",
                             "00001", "00002", "00003",
                             "00001", "00002", "00003"),
                      lat = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                      long = c(4, 5, 6, 4, 5, 6, 4, 5, 6))
#>      pc lat long
#> 1 00001   1    4
#> 2 00002   2    5
#> 3 00003   3    6
#> 4 00001   1    4
#> 5 00002   2    5
#> 6 00003   3    6
#> 7 00001   1    4
#> 8 00002   2    5
#> 9 00003   3    6
I have searched for a solution for quite a long time but, since I do not know how to phrase the question properly, I have had no success so far. I would really appreciate some guidance here.
Thank you
We could use left_join from the dplyr package, joining by pc.
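The answer's own code did not survive on this page, so what follows is only a minimal sketch of the join it describes, using the df and df2 from the question. The names result, idx and result_base are made up for illustration. The match() lookup at the end is a base R alternative added here for comparison; it is not part of the original answer.

```r
library(dplyr)

# Data from the question
df <- data.frame(pc = c("00001", "00002", "00003",
                        "00001", "00002", "00003",
                        "00001", "00002", "00003"))

df2 <- data.frame(pc = c("00001", "00002", "00003"),
                  lat = c(1, 2, 3),
                  long = c(4, 5, 6))

# Keep every row of df, in its original order, and attach the
# matching lat/long columns from df2. The result is still a data frame.
result <- left_join(df, df2, by = "pc")

# Base R alternative (an addition, not from the answer):
# match() returns, for each pc in df, the row of df2 holding that pc.
idx <- match(df$pc, df2$pc)
result_base <- df
result_base$lat <- df2$lat[idx]
result_base$long <- df2$long[idx]

print(result)
```

Both versions are vectorized, so no explicit loop is needed. Postal codes in df that have no entry in df2 should end up with NA coordinates. If df2 listed the same postal code more than once, left_join would return one row per match, so it may be worth deduplicating df2 first.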