Multiple linear regression

Posted on 2024-08-02 17:46:58, 196 words, 10 views, 0 comments

I am trying to use GLSMultipleLinearRegression (from the Apache Commons Math package) for multiple linear regression. It expects a covariance matrix as input, and I am not sure how to compute it. I have one array for the dependent variable and 3 arrays for the independent variables.
Any idea how to compute the covariance matrix?

Note: I have 200 items for each of the 3 independent variables.

Thanks
Bharani

Comments (6)

乖乖哒 2024-08-09 17:46:58

If you do not know the covariance between the errors, you can take an iterative approach. First fit Ordinary Least Squares (OLS) and compute the errors (residuals) and the covariances between them. Then apply GLS using that estimated covariance matrix, and re-estimate the covariance matrix from the new residuals. Keep iterating GLS with the updated covariance matrix until the estimates converge. Here is a link (.pdf warning) to an example of this method, along with a related discussion of Weighted and Iteratively Weighted Least Squares, which apply when the errors are uncorrelated, unlike the general case GLS is designed for.
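
The iteration described above can be sketched in plain Java. This is only an illustration under a loud assumption: with a single residual per observation, an unrestricted 200 x 200 error covariance cannot be estimated, so some structure has to be imposed. The sketch assumes AR(1) errors (each error correlated with the previous one by a factor rho), which gives the Cochrane-Orcutt form of feasible GLS. The class and method names (`IterativeFgls`, `fit`, `estimateRho`) are hypothetical, and nothing here uses or mirrors the Commons Math API.

```java
import java.util.Arrays;

final class IterativeFgls {

    /** Solves a * z = b by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] z = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = m[i][n];
            for (int c = i + 1; c < n; c++) s -= m[i][c] * z[c];
            z[i] = s / m[i][i];
        }
        return z;
    }

    /** OLS via the normal equations (X'X) beta = X'y; x must already contain any intercept column. */
    static double[] ols(double[][] x, double[] y) {
        int p = x[0].length;
        double[][] xtx = new double[p][p];
        double[] xty = new double[p];
        for (int i = 0; i < y.length; i++) {
            for (int j = 0; j < p; j++) {
                xty[j] += x[i][j] * y[i];
                for (int k = 0; k < p; k++) xtx[j][k] += x[i][j] * x[i][k];
            }
        }
        return solve(xtx, xty);
    }

    /** Lag-1 autocorrelation of the residuals y - X beta (the assumed AR(1) parameter). */
    static double estimateRho(double[][] x, double[] y, double[] beta) {
        double num = 0.0, den = 0.0, prev = 0.0;
        for (int t = 0; t < y.length; t++) {
            double e = y[t];
            for (int j = 0; j < beta.length; j++) e -= x[t][j] * beta[j];
            if (t > 0) { num += e * prev; den += prev * prev; }
            prev = e;
        }
        return den == 0.0 ? 0.0 : num / den;
    }

    /** Start from OLS, then alternate: estimate rho from residuals, quasi-difference, refit. */
    static double[] fit(double[][] x, double[] y, int maxIter, double tol) {
        int n = y.length, p = x[0].length;
        double[] beta = ols(x, y);
        double rho = 0.0;
        for (int iter = 0; iter < maxIter; iter++) {
            double newRho = estimateRho(x, y, beta);
            // Quasi-differencing is GLS under the AR(1) covariance, dropping the first observation.
            double[][] xs = new double[n - 1][p];
            double[] ys = new double[n - 1];
            for (int t = 1; t < n; t++) {
                ys[t - 1] = y[t] - newRho * y[t - 1];
                for (int j = 0; j < p; j++) xs[t - 1][j] = x[t][j] - newRho * x[t - 1][j];
            }
            beta = ols(xs, ys);
            boolean converged = Math.abs(newRho - rho) < tol;
            rho = newRho;
            if (converged) break;
        }
        return beta;
    }

    public static void main(String[] args) {
        double[][] x = new double[50][];
        double[] y = new double[50];
        for (int t = 0; t < 50; t++) {
            x[t] = new double[] {1.0, t};                 // intercept column plus one regressor
            y[t] = 1.0 + 2.0 * t + 0.1 * Math.sin(t);     // made-up data with serially correlated noise
        }
        System.out.println(Arrays.toString(fit(x, y, 20, 1e-8)));
    }
}
```

If the errors are better described by unequal variances than by serial correlation, the same loop shape applies, but the re-estimation step would produce per-observation weights instead of rho.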

初见你 2024-08-09 17:46:58

Just came across the Flanagan library, which does this out of the box. I also got a mail from the Commons user list saying that Commons Math does not currently support FGLS, i.e. automatic estimation of the covariance matrix.

-Bharani

陌生 2024-08-09 17:46:58

If you have no idea of the covariance between the errors, I would use Ordinary Least Squares (OLS) instead of Generalized Least Squares (GLS). This amounts to taking the identity matrix as the covariance matrix. The library appears to implement OLS in OLSMultipleLinearRegression.
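
The claim that OLS is GLS with an identity covariance matrix can be checked directly from the textbook estimators. The symbols below ($X$, $y$, $\beta$, $\varepsilon$, $\Omega$, $n$) are standard notation introduced here for illustration; they do not appear in the original posts.

```latex
% Linear model with n observations (n = 200 in the question):
y = X\beta + \varepsilon, \qquad
\operatorname{E}[\varepsilon] = 0, \qquad
\operatorname{Var}(\varepsilon) = \Omega \in \mathbb{R}^{n \times n}, \qquad
\Omega_{ij} = \operatorname{Cov}(\varepsilon_i, \varepsilon_j)

% GLS estimator:
\hat{\beta}_{\mathrm{GLS}} = \left( X^{\top} \Omega^{-1} X \right)^{-1} X^{\top} \Omega^{-1} y

% With \Omega = I (or any multiple \sigma^2 I), \Omega^{-1} cancels:
\hat{\beta}_{\mathrm{GLS}} = \left( X^{\top} X \right)^{-1} X^{\top} y = \hat{\beta}_{\mathrm{OLS}}
```

Note that $\Omega$ has one row and one column per observation, so for 200 observations it is a 200 x 200 matrix describing how the error terms of different observations co-vary.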

残疾 2024-08-09 17:46:58

Have you tried creating a covariance matrix directly from your data?

new Covariance(data).getCovarianceMatrix()

Using the information in the comment, we know that there are 3 independent variables, 1 dependent variable, and 200 samples. That implies a data array with 4 columns and 200 rows. The end result will look something like this (typing everything out explicitly to try to explain what I mean):

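// Package names below are those of Commons Math 3.x; older releases use org.apache.commons.math.
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.correlation.Covariance;

double[][] data = new double[200][]; // 200 rows (samples), 4 columns (y, x1, x2, x3)
data[0] = new double[]{y[0], x[0][0], x[1][0], x[2][0]};
data[1] = new double[]{y[1], x[0][1], x[1][1], x[2][1]};
data[2] = new double[]{y[2], x[0][2], x[1][2], x[2][2]};
// ... etc.
data[199] = new double[]{y[199], x[0][199], x[1][199], x[2][199]};
// Each column is treated as one variable, so the result is a 4 x 4 matrix.
RealMatrix covarianceMatrix = new Covariance(data).getCovarianceMatrix();
double[][] omega = covarianceMatrix.getData();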

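Then, when you're doing your actual regression, you have your covariance matrix:

GLSMultipleLinearRegression regression = new GLSMultipleLinearRegression();
// Assumes the dependent variable is in y and the independent variables are in x,
// and that the covariance matrix omega was built as shown above.
// Caution: newSampleData expects x with one row per observation and omega as the
// n x n (here 200 x 200) covariance of the errors; a 4 x 4 covariance between
// variables does not match those dimensions and is likely to be rejected.
regression.newSampleData(y, x, omega); // GLS requires the covariance matrix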

煮酒 2024-08-09 17:46:58

@Mark Lavin

"You would first use Ordinary Least Squares, calculating the errors, and the covariances between the errors"

I'm a bit confused. Since we have only one response variable, the residual errors should be a one-dimensional variable. Where, then, does a covariance matrix of the errors fit in?

墨落画卷 2024-08-09 17:46:58

You need to organize the 3 independent variates as column vectors in a matrix, x1, x2, x3 (N columns), where each row is an observation (M rows). This will be an M x N matrix.

You then plug this data matrix into a covariance routine provided by Apache Commons Math, such as:
Covariance.computeCovarianceMatrix(RealMatrix matrix).
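
To make the shape of the result concrete, here is a minimal plain-Java sketch of what such a routine computes: the sample covariance between the columns of an M x N data matrix. The class and method names (`ColumnCovariance`, `covarianceMatrix`) are hypothetical and this is not the Commons Math implementation; it divides by M - 1, the usual unbiased estimator.

```java
final class ColumnCovariance {

    /** Sample covariance matrix of the columns of data; rows are observations. */
    static double[][] covarianceMatrix(double[][] data) {
        int m = data.length;     // number of observations (M)
        int n = data[0].length;  // number of variables (N)

        // Column means.
        double[] mean = new double[n];
        for (double[] row : data) {
            for (int j = 0; j < n; j++) mean[j] += row[j] / m;
        }

        // cov[j][k] = sum over rows of (x_j - mean_j) * (x_k - mean_k), divided by M - 1.
        double[][] cov = new double[n][n];
        for (double[] row : data) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (m - 1);
                }
            }
        }
        return cov;
    }

    public static void main(String[] args) {
        // Made-up data: 3 observations of 2 variables, so the result is 2 x 2.
        double[][] data = {{1.0, 2.0}, {2.0, 4.0}, {3.0, 6.0}};
        double[][] cov = covarianceMatrix(data);
        System.out.println(cov.length + " x " + cov[0].length);
    }
}
```

The output is always N x N, so three regressors give a 3 x 3 matrix regardless of how many observations there are. That is the covariance between variables, which is a different object from the M x M covariance of the error terms that a GLS fit takes as input.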
