What is the best way to run a loop of regressions in R?

Posted on 2024-08-29 01:40:31

Assume that I have two indexable data sources X and Y, say matrices, and I want to run a set of independent regressions and store the results. My initial approach would be

results <- matrix(nrow = nrow(X), ncol = 2)
for (i in seq_len(nrow(X))) {
    # row i holds the intercept and slope of the i-th regression
    results[i, ] <- coefficients(lm(Y[i, ] ~ X[i, ]))
}

But loops are bad, so I could do it with lapply instead:

out <- lapply(seq_len(nrow(X)), function(i) coefficients(lm(Y[i, ] ~ X[i, ])))

Is there a better way to do this?

奶气 2024-09-05 01:40:31

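You are definitely over-optimizing here. The overhead of the loop is negligible compared with the cost of fitting the model, so the simple answer is: use whichever form you find easiest to understand. I would go with the for loop, but lapply is fine too.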
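
As a rough illustration of this point, the sketch below runs both forms on simulated data and checks that they return the same coefficients. The data, the dimensions, and the helper names `fit_loop` and `fit_apply` are assumptions made for the example; they do not come from the question.

```r
set.seed(1)
n_series <- 50   # number of independent regressions (rows); assumed
n_obs    <- 20   # observations per regression (columns); assumed
X <- matrix(rnorm(n_series * n_obs), nrow = n_series)
Y <- 2 + 3 * X + matrix(rnorm(n_series * n_obs, sd = 0.1), nrow = n_series)

# Explicit loop writing into a preallocated matrix
fit_loop <- function(X, Y) {
  results <- matrix(NA_real_, nrow = nrow(X), ncol = 2,
                    dimnames = list(NULL, c("intercept", "slope")))
  for (i in seq_len(nrow(X))) {
    results[i, ] <- coef(lm(Y[i, ] ~ X[i, ]))
  }
  results
}

# Same computation with vapply; t() turns the 2 x n result into n x 2
fit_apply <- function(X, Y) {
  out <- vapply(seq_len(nrow(X)),
                function(i) unname(coef(lm(Y[i, ] ~ X[i, ]))),
                numeric(2))
  res <- t(out)
  colnames(res) <- c("intercept", "slope")
  res
}

all.equal(fit_loop(X, Y), fit_apply(X, Y))
```

Wrapping either call in `system.time()` is one way to check on your own data how the two forms compare.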

乄_柒ぐ汐 2024-09-05 01:40:31

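I use plyr for this kind of thing, but I agree that this is not a question of processing efficiency; it is a question of what you find easy to read and write.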
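
For readers who have not used plyr, here is a minimal sketch of what that might look like with `ldply`, which applies a function to each element and row-binds the returned data frames. The simulated data and the column names `row`, `intercept`, and `slope` are assumptions for illustration, and the block is skipped if plyr is not installed.

```r
set.seed(1)
X <- matrix(rnorm(200), nrow = 10)   # 10 regressions, 20 points each (assumed)
Y <- 1 + 2 * X + matrix(rnorm(200, sd = 0.1), nrow = 10)

if (requireNamespace("plyr", quietly = TRUE)) {
  # One data-frame row per regression, combined into a single data frame
  coefs <- plyr::ldply(seq_len(nrow(X)), function(i) {
    b <- coef(lm(Y[i, ] ~ X[i, ]))
    data.frame(row = i, intercept = b[[1]], slope = b[[2]])
  })
  print(head(coefs))
}
```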

冷︶言冷语的世界 2024-09-05 01:40:31

If you just want to perform straightforward multiple linear regression, then I would recommend not using lm(). There is lsfit(), but I'm not sure it would offer that much of a speed-up (I have never performed a formal comparison). Instead I would recommend computing (X'X)^{-1}X'y using qr() and qr.coef(). This will allow you to perform multivariate multiple linear regression; that is, treating the response variable as a matrix instead of a vector and applying the same regression to each vector of observations.

# Z: design matrix (one row per observation)
# Y: matrix of observations (each row is a vector of observations, one column per response)
## Estimation via multivariate multiple linear regression
beta <- qr.coef(qr(Z), Y)
## Fitted values
Yhat <- Z %*% beta
## Residuals
u <- Y - Yhat
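
To make the snippet above concrete, the following sketch builds a small simulated `Z` and `Y` and checks that the single `qr.coef()` call agrees with fitting each response column separately with `lm()`. The dimensions, the true coefficient matrix `B`, and the noise level are all assumptions chosen for the example.

```r
set.seed(42)
n <- 100  # observations (assumed)
m <- 5    # number of response variables (assumed)

x <- rnorm(n)
Z <- cbind(1, x)                          # design matrix: intercept + one predictor
B <- rbind(seq_len(m), 0.5 * seq_len(m))  # true coefficients, 2 x m
Y <- Z %*% B + matrix(rnorm(n * m, sd = 0.1), nrow = n)

# One QR decomposition, all m regressions solved together
beta <- qr.coef(qr(Z), Y)

# Reference: fit each response column separately with lm()
beta_lm <- sapply(seq_len(m), function(j) coef(lm(Y[, j] ~ x)))

all.equal(unname(beta), unname(beta_lm))
```

Note that `lm()` itself also accepts a matrix response, so `coef(lm(Y ~ x))` would give the same coefficients; the QR route mainly avoids the extra work `lm()` does to build a model object.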

In your example, is there a different design matrix per vector of observations? If so, you may be able to modify Z in order to still accommodate this.
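
If every response row does have its own predictor row, as in the question's `Y[i, ] ~ X[i, ]`, one possible alternative to modifying Z is the closed-form solution for simple regression with an intercept, computed for all rows at once. This is a swapped-in technique rather than what the answer describes, and the simulated data below are assumptions for illustration.

```r
set.seed(7)
X <- matrix(rnorm(30 * 25), nrow = 30)   # 30 regressions, 25 points each (assumed)
Y <- -1 + 0.5 * X + matrix(rnorm(30 * 25, sd = 0.2), nrow = 30)

# Centre each row; a length-nrow vector recycles down the columns,
# so row i has its own mean subtracted
Xc <- X - rowMeans(X)
Yc <- Y - rowMeans(Y)

# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x), per row
slope     <- rowSums(Xc * Yc) / rowSums(Xc^2)
intercept <- rowMeans(Y) - slope * rowMeans(X)
closed_form <- cbind(intercept, slope)

# Check against the lm() loop from the question
via_lm <- t(sapply(seq_len(nrow(X)), function(i) coef(lm(Y[i, ] ~ X[i, ]))))

all.equal(unname(closed_form), unname(via_lm))
```

This only covers the one-predictor case; with several predictors per row, a per-row call to `qr.coef()` or `lm.fit()` would be the more natural fallback.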
