cor shows only NA or 1 for correlations - why?
I'm running cor() on a data.frame with all numeric values and I'm getting this as the result:
          price  exprice  ...
price         1       NA
exprice      NA        1
...
So every value in the resulting table is either 1 or NA. Why are the NAs showing up instead of valid correlations?
Comments (6)
In my case I was using more than two variables, and this worked better for me:
However:
The NA can actually have two causes. One is that there is an NA in your data. The other is that one of the variables is constant. A constant variable has a standard deviation of zero, so the cor function returns NA.
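To tell the two causes apart, it can help to inspect each column before calling cor(). The following is a minimal sketch under stated assumptions: the data frame df and its columns (price, exprice, flat) are made up for illustration and are not the asker's data.

```r
# Hypothetical data: 'price' has one missing value, 'flat' never changes.
df <- data.frame(
  price   = c(10, 12, NA, 15, 11),
  exprice = c(20, 25, 22, 31, 21),
  flat    = c(5, 5, 5, 5, 5)
)

# Cause 1: count the missing values in each column.
na_counts <- colSums(is.na(df))

# Cause 2: find columns whose standard deviation is zero (constant columns).
col_sd <- sapply(df, sd, na.rm = TRUE)
constant_cols <- names(col_sd)[col_sd == 0]

print(na_counts)
print(constant_cols)
```

Columns with a non-zero NA count point to the first cause; any name listed in constant_cols points to the second.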
Tell the correlation to ignore the NAs with the use argument, e.g.:
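A minimal sketch of such a call, assuming two hypothetical vectors that stand in for the question's price and exprice columns (the values are invented for illustration):

```r
# Hypothetical vectors; 'price' contains one missing value.
price   <- c(10, 12, NA, 15, 11)
exprice <- c(20, 25, 22, 31, 21)

# Default behaviour (use = "everything"): a missing value in the
# inputs makes the result NA.
r_default <- cor(price, exprice)

# use = "complete.obs" drops the observations that contain an NA
# before the correlation is computed.
r_complete <- cor(price, exprice, use = "complete.obs")

print(r_default)
print(r_complete)
```

Here r_default is NA, while r_complete is computed from the four observations where both values are present.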
The 1s are because everything is perfectly correlated with itself, and the NAs are because there are NAs in your variables.
You will have to specify how you want R to compute the correlation when there are missing values, because by default a coefficient is only computed from complete information.
You can change this behavior with the use argument to cor; see ?cor for details.
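As a rough illustration of how the use argument changes the result for a whole data frame, the sketch below compares the default with two of the documented options. The data frame and its column names are assumptions made up for this example.

```r
# Hypothetical data with missing values in different rows.
df <- data.frame(
  price   = c(10, 12, NA, 15, 11, 14),
  exprice = c(20, 25, 22, 31, 21, NA),
  qty     = c(3, 4, 2, 6, 3, 5)
)

# Default: any pair that involves a column containing NA yields NA.
m_default <- cor(df)

# "complete.obs": only rows with no NA in any column are used,
# and the same rows are used for every pair.
m_complete <- cor(df, use = "complete.obs")

# "pairwise.complete.obs": each pair uses the rows that are
# complete for those two columns.
m_pairwise <- cor(df, use = "pairwise.complete.obs")

print(m_default)
print(m_complete)
print(m_pairwise)
```

The pairwise option keeps more data per pair, but ?cor notes that the resulting matrix may not be positive semi-definite, so which option fits depends on what the matrix is used for afterwards.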
NAs also appear if there are attributes with zero variance (all elements equal); see for instance:
which returns:
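As a hedged sketch of the zero-variance case (made-up data, not the answer's own snippet), the following shows the NA appearing without any missing values, and one possible way to drop constant columns before calling cor():

```r
# Hypothetical data: no missing values, but 'flat' is constant.
df <- data.frame(
  price   = c(10, 12, 13, 15, 11),
  exprice = c(20, 25, 22, 31, 21),
  flat    = c(5, 5, 5, 5, 5)
)

# cor() gives NA for pairs involving 'flat' and warns that the
# standard deviation is zero; the warning is silenced here.
m_all <- suppressWarnings(cor(df))

# Keep only the columns whose standard deviation is positive.
keep <- sapply(df, sd) > 0
m_kept <- cor(df[, keep, drop = FALSE])

print(m_all)
print(m_kept)
```

Dropping the constant column is only one option; whether it is appropriate depends on why the column is constant in the first place.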
Very simple and correct answer:
Tell the correlation to ignore the NAs with the use argument, e.g.: