How do I get a working corrplot of my data?

Posted on 2025-01-29 07:42:11

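I am trying to get a corrplot of my data variables, which are a mix of binary, continuous, and categorical variables. However, whenever I run my code it keeps giving me an error. When I pass in my data frame, called DF2, the error is: Error in corrplot(DF2) : The matrix is not in [-1, 1]!. How can I fix this?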

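When I compute the correlations, I also run into problems with certain variables, even though they hold numeric and integer values [1].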

An example of my data variables is attached, where hh_code is the column used for identification: [2]

How can I get the correlations between my data variables in R? Thanks!

谁的新欢旧爱 2025-02-05 07:42:12

If you have a table in which some columns are numeric and others are categorical, you can use the function GGally::ggpairs to get an overview of the associations between these variables:

library(GGally)
#> Loading required package: ggplot2
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
# Mix of categorical (manufacturer, drv) and numeric (displ, year, cty) columns
data <- ggplot2::mpg[c(1,3,4,7,8)]
ggpairs(data)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2022-05-17 by the reprex package (v2.0.0)
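On the specific error in the question: as far as I know, corrplot() expects a correlation matrix by default, so handing it a raw data frame such as DF2 trips its "not in [-1, 1]" check. Below is a minimal sketch of one way around it. The data frame `df2` and every column except hh_code are made up for illustration, since the real data is not shown. The idea is to drop the ID column, keep the numeric columns, and pass the result of cor() instead of the data itself.

```r
# Hypothetical stand-in for the asker's DF2: an ID column plus mixed variables.
df2 <- data.frame(
  hh_code = 1:6,                                     # identifier, not a variable
  income  = c(120, 340, 210, 560, 90, 430),          # continuous (made up)
  owns    = c(0, 1, 0, 1, 0, 1),                     # binary (made up)
  region  = factor(c("a", "b", "a", "c", "b", "c"))  # categorical (made up)
)

# Keep only the numeric columns and exclude the identifier.
num_cols <- vapply(df2, is.numeric, logical(1))
num_cols["hh_code"] <- FALSE

# Correlation matrix of the remaining columns; pairwise deletion handles NAs.
m <- cor(df2[, num_cols, drop = FALSE], use = "pairwise.complete.obs")

# Every entry now lies in [-1, 1], which is the range corrplot() checks for.
print(range(m))

# Draw the plot only if the corrplot package is installed.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(m)
}
```

The categorical column is simply left out here. Relationships involving it are better examined with ggpairs or with a test suited to categorical data than with a correlation coefficient.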

If you need a bit more rigour, you can use statistical tests to check for significant relationships between columns / variables (provided the assumptions of each test hold):

covariate x	outcome y	test	R function
numeric	numeric	Pearson correlation	`cor.test(x, y, method = "pearson")`
binary	numeric	t test	`t.test(y ~ x)`
ordinal (ordered factor)	ordinal (ordered factor)	Spearman correlation	`cor.test(as.numeric(x), as.numeric(y), method = "spearman")`
categorical (many levels)	numeric	ANOVA	`anova(lm(y ~ x))`
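To see the four rows of the table in action, here is a rough sketch on the built-in mtcars data. The data set is a stand-in chosen for illustration, not the asker's data, and treating gear and carb as ordered factors is an assumption made only so the Spearman row has something to work on.

```r
# Built-in example data, used as a stand-in (an assumption).
cars <- mtcars
cars$am  <- factor(cars$am, labels = c("automatic", "manual"))  # binary
cars$cyl <- factor(cars$cyl)                                     # categorical, 3 levels

# numeric vs numeric: Pearson correlation
pearson <- cor.test(cars$mpg, cars$wt, method = "pearson")

# binary vs numeric: two-sample t test, written as outcome ~ group
ttest <- t.test(mpg ~ am, data = cars)

# ordinal vs ordinal: Spearman correlation on the level codes;
# exact = FALSE avoids the warning about exact p-values with ties
gear_o <- factor(cars$gear, ordered = TRUE)
carb_o <- factor(cars$carb, ordered = TRUE)
spearman <- cor.test(as.numeric(gear_o), as.numeric(carb_o),
                     method = "spearman", exact = FALSE)

# categorical (many levels) vs numeric: one-way ANOVA
aov_tab <- anova(lm(mpg ~ cyl, data = cars))

print(pearson$estimate)
print(ttest$p.value)
print(spearman$estimate)
print(aov_tab)
```

Each call returns an object carrying the test statistic and p-value, so the results can be collected into one summary table if there are many variable pairs to check.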