Match values across different datasets by group in R
I have the following two datasets:
df1 <- data.frame(
"group" = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5),
"numbers" = c(55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60))
df2 <- data.frame(
"group" = c(1, 1, 2, 2, 2, 3, 3, 4, 5),
"P1" = c(55, NA, 60, 55, 75, 75, 55, 55, 60),
"P2" = c(55, 75, 55, 60, NA, 75, 55, NA, 60),
"P3" = c(75, 55, 60, 75, NA, 75, 60, 55, 60))
In df1 each group has the same three numbers (in reality there are around 500 numbers).
I want to check whether the values in the column "numbers" in df1 are contained in the columns P1, P2, and P3 of df2. There are two problems I am stuck on: 1. The values in the numbers column of df1 can occur in different groups in df2 (groups are defined by the group column in df1 and df2). 2. The datasets have different lengths. Is there a way to merge both datasets and obtain the following dataset:
df3 <- data.frame(
"group" = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5),
"numbers" = c(55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60),
"P1new" = c(1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1),
"P2new" = c(1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1),
"P3new" = c(1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1))
where P1new (and likewise P2new and P3new) contains the value 1 if df2$P1 contains the value in df1$numbers within the same group (as I said, numbers can recur in different groups). For example, P3 contains the value 75 in group 1 but not in group 5. So for the number 75, P3new would be 1 in group 1 and 0 in group 5.
This question is similar to "Find matching values in different datasets by groups in R", but I could not adapt the code to my objectives. So I would really appreciate any help.
Comments (2)
Interesting question. Here's a way with dplyr functions:
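The code that accompanied this comment is not preserved on this page. As a minimal sketch of one dplyr-based way to get the requested columns (not necessarily the commenter's original code), df1 can be grouped by group and each number tested against the values df2 holds for that same group. The helper name in_group is hypothetical and only used for illustration.

```r
library(dplyr)

df1 <- data.frame(
  group   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5),
  numbers = c(55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60))
df2 <- data.frame(
  group = c(1, 1, 2, 2, 2, 3, 3, 4, 5),
  P1 = c(55, NA, 60, 55, 75, 75, 55, 55, 60),
  P2 = c(55, 75, 55, 60, NA, 75, 55, NA, 60),
  P3 = c(75, 55, 60, 75, NA, 75, 60, 55, 60))

# Hypothetical helper: 1 where a number occurs in column `col` of df2
# among the rows of group `g`, 0 otherwise. %in% never returns NA,
# so NA entries in df2 simply never match.
in_group <- function(nums, g, col) {
  as.integer(nums %in% df2[[col]][df2$group == g])
}

df3 <- df1 %>%
  group_by(group) %>%
  mutate(
    # group[1] is the (single) group id of the current group
    P1new = in_group(numbers, group[1], "P1"),
    P2new = in_group(numbers, group[1], "P2"),
    P3new = in_group(numbers, group[1], "P3")
  ) %>%
  ungroup()
```

Because the lookup is restricted to the rows of df2 with the same group id, the differing lengths of df1 and df2 do not matter, and a number that appears in another group is not counted.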
Another possible solution:
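The code for this solution is also missing from this page. Since the follow-up mentions purrr, one plausible shape, offered only as a sketch under that assumption, is to map over the P columns and, for each one, over the (group, number) pairs of df1. The variable names cols and flags are illustrative.

```r
library(purrr)

df1 <- data.frame(
  group   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5),
  numbers = c(55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60))
df2 <- data.frame(
  group = c(1, 1, 2, 2, 2, 3, 3, 4, 5),
  P1 = c(55, NA, 60, 55, 75, 75, 55, 55, 60),
  P2 = c(55, 75, 55, 60, NA, 75, 55, NA, 60),
  P3 = c(75, 55, 60, 75, NA, 75, 60, 55, 60))

# Values are the df2 column names, names are the new column names
cols <- set_names(c("P1", "P2", "P3"), c("P1new", "P2new", "P3new"))

# For every P column, walk over df1 row by row and check whether the
# number occurs in that column of df2 within the same group
flags <- map(cols, function(col) {
  map2_int(df1$group, df1$numbers, function(g, n) {
    as.integer(n %in% df2[[col]][df2$group == g])
  })
})

# Attach the three indicator columns to df1
df3 <- cbind(df1, as.data.frame(flags))
```

This checks one row at a time, so it is easy to read but does more work than a vectorised lookup when there are around 500 numbers per group.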
Or, without purrr, another possibility:
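The original code is not shown here either. A minimal base R sketch that avoids purrr, assuming nothing beyond the data frames from the question, builds a combined "group_value" key on both sides and uses %in% on those keys. The helper name key is hypothetical.

```r
df1 <- data.frame(
  group   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5),
  numbers = c(55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60, 55, 75, 60))
df2 <- data.frame(
  group = c(1, 1, 2, 2, 2, 3, 3, 4, 5),
  P1 = c(55, NA, 60, 55, 75, 75, 55, 55, 60),
  P2 = c(55, 75, 55, 60, NA, 75, 55, NA, 60),
  P3 = c(75, 55, 60, 75, NA, 75, 60, 55, 60))

# Hypothetical helper: combine group id and value into one string key,
# e.g. group 1 and value 55 become "1_55"
key <- function(g, v) paste(g, v, sep = "_")

df3 <- df1
for (col in c("P1", "P2", "P3")) {
  # A df1 row gets 1 if its (group, number) key also exists in df2 for
  # this column; NA values in df2 become keys like "1_NA" and never match
  df3[[paste0(col, "new")]] <- as.integer(
    key(df1$group, df1$numbers) %in% key(df2$group, df2[[col]])
  )
}
```

The comparison is fully vectorised per column, which should scale comfortably to a few hundred numbers per group. It relies on the numbers printing identically on both sides, which holds for whole numbers like those in the question.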