How to resolve multicollinearity?

Posted on 2025-01-18 03:10:50

I constructed a linear model and tried to calculate the VIF of the variables but I get the following error:

 vif(lm_model3101)

Error in vif.default(lm_model3101) : 
  there are aliased coefficients in the model

To check which numeric variables are correlated, I calculated the correlation matrix of the numeric variables used in the model, and there is no perfect or nearly perfect correlation between any pair of them:

cor(multi)


                        mydata..CRU.Index. mydata..GDP.per.capita. mydata.price_per_unit mydata.price_discount mydata..AC..Volume.
mydata..CRU.Index.             1.000000000             0.006036169             0.1646463          -0.097077238        -0.006590327
mydata..GDP.per.capita.        0.006036169             1.000000000             0.1526220           0.008135387        -0.137733119
mydata.price_per_unit          0.164646319             0.152621974             1.0000000          -0.100344865        -0.310770525
mydata.price_discount         -0.097077238             0.008135387            -0.1003449           1.000000000         0.339961760
mydata..AC..Volume.           -0.006590327            -0.137733119            -0.3107705           0.339961760         1.000000000

What could the problem be? Any help or suggestions would be appreciated. The rest of our explanatory variables are factors, so they cannot be correlated.

Comments (1)

成熟的代价 2025-01-25 03:10:50

Having aliased coefficients doesn't necessarily mean that two predictors are perfectly correlated. It means that the model terms are linearly dependent, that is, at least one term is a linear combination of the others. The terms involved could be factors or continuous variables, which is why a pairwise correlation matrix of the numeric variables can miss the problem. To find them, use the alias() function. For example:

y <- runif(10)
x1 <- runif(10)
x2 <- runif(10)
x3 <- x1 + x2

alias(y~x1+x2+x3)

Model :
y ~ x1 + x2 + x3

Complete :
   (Intercept) x1 x2
x3 0           1  1 

This identifies x3 as being the sum of x1 and x2: the Complete row for x3 has a coefficient of 0 on the intercept and 1 on each of x1 and x2.
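Since the remaining predictors in the question are factors, here is a rough sketch of how two factors can produce the same error even though no numeric correlation looks suspicious. Everything in it is made up for illustration (the data frame d, the factors store and region, the response y); it only assumes that one factor is fully determined by another, which may or may not be what is happening in the original data. It uses base R only; the vif() in the question is presumably the one from the car package, so that call is left as a comment.

```r
# Hypothetical data: all names here are invented for illustration.
set.seed(1)
d <- data.frame(
  store = factor(rep(c("A", "B", "C", "D"), each = 5)),
  y     = rnorm(20)
)
# region is fully determined by store (A, B -> north; C, D -> south),
# so its dummy column is the sum of the storeC and storeD dummy columns.
d$region <- factor(ifelse(d$store %in% c("A", "B"), "north", "south"))

fit <- lm(y ~ store + region, data = d)

# Aliased terms appear as NA coefficients in the fitted model.
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# alias() expresses each aliased term through the estimable ones.
print(alias(fit)$Complete)

# Dropping the redundant term leaves a full-rank model.
fit2 <- update(fit, . ~ . - region)
print(anyNA(coef(fit2)))
# car::vif(fit2)  # assumed to be the vif() from the question; not run here
```

In a real model, running alias() on the fitted lm object (for example alias(lm_model3101)) should list which terms are aliased and which other terms they depend on; removing or recoding one term from each such group is one way to get vif() to run.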
