How to use is.na to identify NA, " ", "", etc.

Posted on 2025-01-19 06:11:30

I have 2 problems:

Problem 1: I am trying to work out how to identify any common missing value formats like NA, " ", "".

I thought is.na would identify all of these formats. Can someone point me in the right direction for what I need to do here?

Problem 2: I need to count the NA, " " and "" values and list the position for all of them.

I've tried:

```{r, echo=TRUE,include=TRUE}
sum(is.na(DF))
which(is.na(DF))
```

but it only counts the NA values (16) and tells me which value position they are in.

However, I also happen to know there are 10 values in my dataset that are missing and whose format isn't NA but " ", so the total for missing values should be 26 and I should get the value position for all of them.

I tried using something like:

sum(is.na(DF, na.strings=c("NA"," ","")))

But I got this error:
Error in is.na(DF, na.strings = c("NA", " ", "")) :
2 arguments passed to 'is.na' which requires 1

Any ideas on what to do here would be amazing as well.

Thank you!

少女的英雄梦 2025-01-26 06:11:30

is.na only detects NA values, not " " or "". You can convert " " and "" to NA using gsub, and then use is.na:

v = c(NA, "", " ", "A")
gsub("^$|^ $", NA, v)
# [1] NA  NA  NA  "A"

sum(is.na(gsub("^$|^ $", NA, v)))
# [1] 3

which(is.na(gsub("^$|^ $", NA, v)))
# [1] 1 2 3

Explanation: ^$ matches the empty string (^ anchors the beginning of the string and $ the end). ^ $ matches a string consisting of exactly one space, with the anchors serving the same purpose, and | is the OR operator.
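The example above works on a single vector, while the question asks about a data frame. Here is a minimal sketch of how the same idea might be applied column by column. The data frame contents and the helper name is_blank are made up for illustration, and trimws is swapped in for the regex, so any whitespace-only string counts as missing, not just a single space.

```r
# Hypothetical example data; DF stands in for the asker's data frame.
DF <- data.frame(
  x = c("a", NA, " ", "d"),
  y = c("", "b", NA, "e"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for NA, "" and whitespace-only strings.
# trimws() strips leading/trailing whitespace, so " " becomes "".
is_blank <- function(col) is.na(col) | trimws(as.character(col)) == ""

# Logical matrix with one column per column of DF.
missing_mask <- sapply(DF, is_blank)

# Total number of missing-like values (4 for this example data).
n_missing <- sum(missing_mask)

# Row and column index of every missing-like value.
missing_pos <- which(missing_mask, arr.ind = TRUE)
print(n_missing)
print(missing_pos)
```

Using arr.ind = TRUE makes which return row/column pairs instead of a single linear index, which is usually easier to read for a data frame. As for the error in the question: na.strings is an argument of read.table and read.csv, not of is.na. If the data comes from a file, passing na.strings = c("NA", " ", "") at import time may be enough for is.na to catch everything afterwards.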
