Remove rows with blank values in one particular column
I am working on a large dataset, with some rows with NAs and others with blanks:
df <- data.frame(ID = c(1:7),
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA))
How do I remove the NAs and blanks in one go (in the start_pc and end_pc columns)? I have in the past used:
df<- df[-which(is.na(df$start_pc)), ]
... to remove the NAs - is there a similar command to remove the blanks?
It is the same construct: simply test for empty strings rather than NA.
In fact, looking at your code, you don't need the which; use the negation instead. And, of course, you can combine the NA test and the empty-string test in a single condition, and simplify it even further with with. You can also test for non-zero string length using nzchar.
Disclaimer: I didn't test any of this code. Please let me know if there are syntax errors anywhere.
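The steps described in this answer can be sketched in base R as follows. This is a reconstruction under assumptions, not the answerer's original code: the names keep_start, df_start, df_with, is_filled and df_both are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the columns are character vectors on any R version.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps the
# postcode columns as character vectors on R versions before 4.0.
df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
                 stringsAsFactors = FALSE)

# Negated logical index instead of -which(): keep rows where start_pc
# is neither NA nor the empty string.
keep_start <- !(is.na(df$start_pc) | df$start_pc == "")
df_start <- df[keep_start, ]

# The same condition written with with(), which avoids repeating df$.
df_with <- with(df, df[!(is.na(start_pc) | start_pc == ""), ])

# nzchar() is TRUE for non-empty strings. By default it is also TRUE for NA,
# so the NA test is still needed. is_filled is a hypothetical helper name.
is_filled <- function(x) !is.na(x) & nzchar(x)

# Apply the test to both columns named in the question.
df_both <- df[is_filled(df$start_pc) & is_filled(df$end_pc), ]
```

One reason to prefer the negated logical index over -which(): if no row matches the condition, which() returns an empty vector, and indexing with its negation selects zero rows instead of keeping them all.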
An elegant solution with dplyr would be:
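The code for this answer did not survive on this page, so the following is only a sketch of what a dplyr version might look like, assuming the dplyr package is installed. The name df_clean is hypothetical.

```r
library(dplyr)

# Sample data from the question.
df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
                 stringsAsFactors = FALSE)

# Conditions separated by commas are combined with AND.
# Each column must be non-NA and non-empty for a row to be kept.
df_clean <- df %>%
  filter(!is.na(start_pc), start_pc != "",
         !is.na(end_pc), end_pc != "")
```

filter() already drops rows where a condition evaluates to NA, so the is.na() tests are redundant here; they are kept to make the intent explicit.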
An alternative solution is to remove the rows with blanks in one variable:
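The original code for this answer is missing here. One possible way to do it, shown only as a sketch, is base R's subset() on a single column; end_pc is chosen because it is the column in the sample data that actually contains blanks, and df_end is a hypothetical name.

```r
# Sample data from the question.
df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
                 stringsAsFactors = FALSE)

# subset() treats an NA condition as FALSE, so rows where end_pc is NA
# are dropped together with the rows where it is blank.
df_end <- subset(df, end_pc != "")
```

Note the difference from plain bracket indexing: df[df$end_pc != "", ] would return all-NA rows wherever end_pc is NA, because an NA in a logical index produces an NA row.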
An easy approach would be making all the blank cells NA and only keeping complete cases. You might also look for na.omit examples. It is a widely discussed topic.
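A minimal sketch of that approach, assuming only the start_pc and end_pc columns should be checked (as in the question). The names cols and df_complete are made up for illustration.

```r
# Sample data from the question.
df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
                 stringsAsFactors = FALSE)

# Columns to check for blanks and NAs.
cols <- c("start_pc", "end_pc")

# Step 1: turn blank cells into NA. %in% returns FALSE for NA values,
# so existing NAs are left as they are.
df[cols] <- lapply(df[cols], function(x) replace(x, x %in% "", NA))

# Step 2: keep only the rows that are complete in the chosen columns.
df_complete <- df[complete.cases(df[cols]), ]
```

na.omit(df) would do the second step across every column of the data frame rather than just the selected ones.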