Handling NA values in apply and unique
I have a 114 row by 16 column data frame where the rows are individuals, and the columns are either their names or NA. For example, the first 3 rows look like this:
name name.1 name.2 name.3 name.4 name.5 name.6 name.7 name.8 name.9 name.10 name.11 name.12 name.13 name.14 name.15
1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> Aanestad <NA> Aanestad <NA> Aanestad <NA>
2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> Ackerman <NA> Ackerman <NA> Ackerman <NA> Ackerman <NA>
3 <NA> <NA> <NA> <NA> <NA> <NA> Alarcon <NA> Alarcon <NA> Alarcon <NA> Alarcon <NA> <NA> <NA>
I want to generate a list (if multiple unique names per row) or vector (if only one unique name per row) of all the unique names, with length 114.
When I try apply(x,1,unique), I get a 2 x Ncol array where sometimes the first row cell is NA and sometimes the second row cell is NA:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] NA NA NA NA "Alquist" NA "Ayala" NA NA
[2,] "Aanestad" "Ackerman" "Alarcon" "Alpert" NA "Ashburn" NA "Baca" "Battin"
When what I'd like is just:
Aanestad
Ackerman
Alarcon
...
I can't seem to figure out how to apply unique() while ignoring NA. na.rm, na.omit, etc. don't seem to work. I feel like I'm missing something really simple...
Thanks!
Comments (2)
unique() does not appear to have an na.rm argument, but you can remove the missing values yourself before calling it.
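A minimal sketch of that idea, assuming a made-up character vector `row` that stands in for one row of the data frame (this is not necessarily the code the answer originally showed):

```r
# Hypothetical row: one name repeated, with NAs in between
row <- c(NA, "Aanestad", NA, "Aanestad", NA)

# unique() treats NA as a value of its own, so it is kept in the result
with_na <- unique(row)

# Drop the NAs by logical indexing first, then take the unique values
without_na <- unique(row[!is.na(row)])

print(with_na)
print(without_na)
```

Here `with_na` still contains the NA alongside the name, while `without_na` holds only the name.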
You were very, very close in your initial solution. But as Aniko remarked, you have to remove NA values before you can use unique(). The approach is to first create a similar data.frame and then use apply() as you did, but with an additional anonymous function that combines na.omit() and unique().
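A sketch of what that could look like, assuming a small made-up data frame `x` with the same shape as the one in the question (the column values and the name `res` are illustrative, not taken from the original answer):

```r
# Hypothetical data frame: each row holds one person's name or NA
x <- data.frame(
  name   = c(NA, "Ackerman", NA),
  name.1 = c("Aanestad", NA, "Alarcon"),
  name.2 = c(NA, "Ackerman", "Alarcon"),
  name.3 = c("Aanestad", NA, NA),
  stringsAsFactors = FALSE
)

# For each row: drop the NAs with na.omit(), then keep the unique values
res <- apply(x, 1, function(r) unique(na.omit(r)))

print(res)
```

When every row has exactly one unique name, apply() simplifies the result to a character vector with one entry per row. If some rows have more unique names than others, apply() returns a list instead, which matches the list-or-vector behaviour the question asks for.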