R: Combine rows with the same ID
Edit: I changed Var4 to a string value because my question was not precise enough about my data, and answers were failing because of invalid types. Sorry about that.
This is my first question here and I hope someone can help me.
I have the following data set:
ID | Date | N_Date | Var1 | Var2 | Var3 | Var4 | type |
---|---|---|---|---|---|---|---|
1 | 4.7.22 | 50000 | 12 | NA | NA | NA | normal |
1 | 4.7.22 | 50000 | NA | 23 | NA | NA | normal |
1 | 4.7.22 | 50000 | NA | NA | 5 | NA | normal |
1 | 4.7.22 | 50000 | NA | NA | NA | asd | normal |
2 | 4.7.22 | 50000 | NA | 2 | NA | NA | normal |
3 | 5.7.22 | 20000 | 7 | NA | NA | NA | normal |
My goal is to have just one row for each ID. So what I want R to do is shift the Var column values for each ID up, or somehow combine them. As you can see, at the moment there is never more than one non-NA value across the Var columns in any row. So it should be easy to overwrite the NAs with the corresponding "real value". I also found similar questions, but the answers did not help in my case:
How to combine rows with the same identifier R?
I think the problem in my case is that I have columns like "Date", "N_Date" (which is the number of observations on that date) and "type". For these columns my code should recognize that the value is exactly the same for every row of the corresponding ID, and just take the first value, for example.
So in the end I would have just 3 rows with the same number of columns, containing all the information.
Thank you very much to anyone who has an idea how to solve this.
Comments (1)
Something like this:
Here we first group by all columns except the `Var` variables, then we use `summarise(across(...))`, as suggested by @Limey in the comments section. The main feature is to use `na.rm = TRUE`.
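The code that went with this answer is not shown here, so the following is only a minimal sketch of the approach it describes, assuming dplyr 1.0.0 or later is installed. It rebuilds the question's data as `df`, groups by every non-`Var` column, and collapses each `Var` column to one value per group. Because `Var4` holds strings, a numeric summary such as `sum(x, na.rm = TRUE)` would not work for every column, so this sketch swaps in a small helper, `first_non_na` (a hypothetical name, not part of dplyr), that drops the NAs and keeps the first remaining value.

```r
library(dplyr)

# Data from the question, rebuilt by hand (Var4 is a character column)
df <- data.frame(
  ID     = c(1, 1, 1, 1, 2, 3),
  Date   = c("4.7.22", "4.7.22", "4.7.22", "4.7.22", "4.7.22", "5.7.22"),
  N_Date = c(50000, 50000, 50000, 50000, 50000, 20000),
  Var1   = c(12, NA, NA, NA, NA, 7),
  Var2   = c(NA, 23, NA, NA, 2, NA),
  Var3   = c(NA, NA, 5, NA, NA, NA),
  Var4   = c(NA, NA, NA, "asd", NA, NA),
  type   = "normal"
)

# Hypothetical helper: first non-NA value, or NA of the same type if none exists
first_non_na <- function(x) x[!is.na(x)][1]

result <- df %>%
  # Group by everything that is identical within an ID
  group_by(ID, Date, N_Date, type) %>%
  # Collapse every Var column to a single value per group
  summarise(across(starts_with("Var"), first_non_na), .groups = "drop")

print(result)
```

Grouping by `Date`, `N_Date` and `type` keeps those columns in the result without having to summarise them. If they ever differed within one ID, that ID would be split into several rows, which may or may not be what is wanted.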