Remove rows with blank values in a specific column

Posted 2025-01-02 21:23:05

I am working on a large dataset, with some rows with NAs and others with blanks:

df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA))

How do I remove the NAs and blanks in one go (in the start_pc and end_pc columns)? I have in the past used:

df<- df[-which(is.na(df$start_pc)), ]

... to remove the NAs - is there a similar command to remove the blanks?

Comments (5)

甲如呢乙后呢 2025-01-09 21:23:05
 df[!(is.na(df$start_pc) | df$start_pc==""), ]
雪化雨蝶 2025-01-09 21:23:05

It is the same construct - simply test for empty strings rather than NA:

Try this:

df <- df[-which(df$start_pc == ""), ]

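In fact, looking at your code, you don't need the which; negating the test is simpler. It is also safer: when no value matches, which() returns an empty vector and df[-integer(0), ] returns no rows at all. So you can simplify it to: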

df <- df[!(df$start_pc == ""), ]
df <- df[!is.na(df$start_pc), ]

And, of course, you can combine these two statements as follows:

df <- df[!(df$start_pc == "" | is.na(df$start_pc)), ]

And simplify it even further with with:

df <- with(df, df[!(start_pc == "" | is.na(start_pc)), ])

You can also test for non-zero string length using nzchar. Note that nzchar() returns TRUE for NA by default, so the NA check is still needed:

df <- with(df, df[nzchar(start_pc) & !is.na(start_pc), ])

Disclaimer: I didn't test any of this code. Please let me know if there are syntax errors anywhere
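
Since the question asks about both start_pc and end_pc, the combined test can be extended to cover the two columns at once. The following is a minimal sketch using the example data from the question; the helper name is_blank is an illustrative assumption, not a base R function.

```r
# Example data from the question
df <- data.frame(
  ID = 1:7,
  home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
  start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
  end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a value is NA or an empty string.
# is.na(x) | x == "" evaluates to TRUE for NA because TRUE | NA is TRUE.
is_blank <- function(x) is.na(x) | x == ""

# Keep a row only if neither column is blank
keep <- !(is_blank(df$start_pc) | is_blank(df$end_pc))
clean <- df[keep, ]

clean$ID  # IDs 2, 3 and 5 remain
```

Indexing with a logical vector that contains no NA avoids both the empty-which() problem and the all-NA rows that a bare comparison against "" can produce.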

牵强ㄟ 2025-01-09 21:23:05

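An elegant solution with dplyr is shown below. Note that it drops rows with a blank or NA in any column, not only in start_pc and end_pc. As far as I am aware, dplyr 1.1.0 and later no longer accept a whole data frame in na_if(), so it is applied column by column here:

library(dplyr)

df %>%
  # recode empty strings "" as NA in every character column
  mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
  # remove rows containing any NA
  na.omit()
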
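
To restrict the check to start_pc and end_pc, as the question asks, one option is filter() with if_all(). This is a sketch only, assuming dplyr 1.0.4 or later (where if_all() is available) and the example data from the question.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  ID = 1:7,
  home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
  start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
  end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
  stringsAsFactors = FALSE
)

# Keep rows where both columns are neither NA nor an empty string
clean <- df %>%
  filter(if_all(c(start_pc, end_pc), ~ !is.na(.x) & .x != ""))

clean$ID  # IDs 2, 3 and 5 remain
```
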
錯遇了你 2025-01-09 21:23:05

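An alternative is to remove the rows with blanks in one variable using subset(), where VAR stands for the column name. Because subset() treats an NA condition as FALSE, this also drops the rows where VAR is NA: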

df <- subset(df, VAR != "")
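
Applied to the two columns in the question, the same idea might look like the sketch below, which uses the question's example data.

```r
# Example data from the question
df <- data.frame(
  ID = 1:7,
  home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
  start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
  end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
  stringsAsFactors = FALSE
)

# subset() keeps only rows whose condition is TRUE, so rows where the
# condition is NA (an NA in either column) are dropped with the blanks
clean <- subset(df, start_pc != "" & end_pc != "")

clean$ID  # IDs 2, 3 and 5 remain
```
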
贪恋 2025-01-09 21:23:05

An easy approach is to turn all the blank cells into NA and keep only the complete cases. Note that this looks at every column, so a blank in home_pc also removes the row. You might also look for na.omit examples. It is a widely discussed topic.

df[df == ""] <- NA
df <- df[complete.cases(df), ]
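
If only start_pc and end_pc should decide which rows are kept, the same two steps can be limited to those columns. This is a sketch under that assumption, using the question's example data; the variable name cols is illustrative.

```r
# Example data from the question
df <- data.frame(
  ID = 1:7,
  home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
  start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
  end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
  stringsAsFactors = FALSE
)

# Columns that decide whether a row is kept (illustrative name)
cols <- c("start_pc", "end_pc")

# Recode blanks to NA in the chosen columns only
df[cols][df[cols] == ""] <- NA

# Keep rows that are complete in the chosen columns; other columns are ignored
clean <- df[complete.cases(df[cols]), ]

clean$ID  # IDs 2, 3 and 5 remain
```

With this variant a blank home_pc on its own does not cause a row to be dropped.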