How to use is.na to identify NA, " ", "", etc.

Posted on 2025-01-19 06:11:30

I have 2 problems:

Problem 1: I am trying to work out how to identify any common missing value formats like NA, " ", "".

I thought is.na would identify all of these formats. Can someone point me in the right direction for what I need to do here?

Problem 2: I need to count the NA, " " and "" values and list the position for all of them.

I've tried:

```{r, echo=TRUE,include=TRUE}
sum(is.na(DF))
which(is.na(DF))
```

But this only counts the NA values (16) and tells me their positions.

However, I also happen to know that 10 more values in my dataset are missing and are stored as " " rather than NA, so the total number of missing values should be 26, and I should get the positions of all of them.

I tried using something like:

sum(is.na(DF, na.strings=c("NA"," ","")))

But I got this error:
Error in is.na(DF, na.strings = c("NA", " ", "")) :
2 arguments passed to 'is.na' which requires 1

Any ideas on what to do here would be amazing as well.

Thank you!

Comments (1)

少女的英雄梦 2025-01-26 06:11:30

is.na only detects NA values, not " " nor "". You can convert " " and "" to NA using gsub, and then use is.na:

v = c(NA, "", " ", "A")
gsub("^$|^ $", NA, v)
# [1] NA  NA  NA  "A"

sum(is.na(gsub("^$|^ $", NA, v)))
# [1] 3

which(is.na(gsub("^$|^ $", NA, v)))
# [1] 1 2 3

Explanation: ^$ matches an empty string (^ marks the beginning of the string and $ the end). ^ $ matches a string consisting of exactly one space (the anchors serve the same purpose), and | is the OR operator.
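
To apply the same idea to a whole data frame rather than a single vector, something like the sketch below might work. The DF here is a made-up example and is_missing is a hypothetical helper name; neither comes from the question. It also assumes that any whitespace-only string should count as missing (via trimws), which is a slightly broader rule than the "^$|^ $" pattern above.

```r
# Made-up data frame standing in for the asker's DF (an assumption, not the real data)
DF <- data.frame(
  id   = c(1, 2, NA, 4),
  name = c("A", "", " ", NA),
  city = c("X", "Y", "  ", "Z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a value is NA, "" or only whitespace.
# nzchar(NA) is TRUE by default, so real NAs are caught by is.na() instead.
is_missing <- function(x) {
  is.na(x) | !nzchar(trimws(as.character(x)))
}

# Logical matrix with one column per DF column
miss <- do.call(cbind, lapply(DF, is_missing))

sum(miss)                    # total number of missing values
# [1] 5

which(miss)                  # column-major positions, like which(is.na(DF))
# [1]  3  6  7  8 11

which(miss, arr.ind = TRUE)  # row/column index of each missing value
colSums(miss)                # missing values per column
```

which(miss) returns the same kind of positions as which(is.na(DF)), while arr.ind = TRUE gives row/column pairs that are usually easier to read. If the data is read from a file, note that na.strings is an argument of read.csv()/read.table() rather than is.na(), so passing na.strings = c("NA", " ", "") at read time may be the simpler route, after which plain is.na(DF) should pick up those values.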
