Computing the projection / hat matrix via QR factorization, SVD (and Cholesky factorization?)



I'm trying to calculate in R a projection matrix P of an arbitrary N x J matrix S:

P = S (S'S) ^ -1 S'

I've been trying to perform this with the following function:

P <- function(S){
  output <- S %*% solve(t(S) %*% S) %*% t(S)
  return(output)
}

But when I use it I get errors like this:

# Error in solve.default(t(S) %*% S, t(S), tol = 1e-07) : 
#  system is computationally singular: reciprocal condition number = 2.26005e-28

I think this is a result of numerical underflow and/or instability, as discussed in numerous places like r-help and here, but I'm not experienced enough with SVD or QR decomposition to fix the problem, or to put the existing code into practice. I've also tried the suggested approach of writing solve as a linear system:

output <- S %*% solve(t(S) %*% S, t(S), tol = 1e-7)

But still it doesn't work. Any suggestions would be appreciated.

I'm pretty sure that my matrix should be invertible and free of any collinearity, if only because I have tried testing this with a matrix of orthogonal dummy variables, and it still doesn't work.

Also, I'd like to apply this to fairly large matrices, so I'm looking for a neat general solution.


烟若柳尘 2025-01-08 00:12:51


Although the OP has not been active for more than a year, I have decided to post an answer anyway. I will use X instead of S because, in statistics, we often want the projection matrix in a linear regression context, where X is the model matrix, y is the response vector, and H = X(X'X)^{-1}X' is the hat / projection matrix, so that Hy gives the fitted values.

This answer assumes the context of ordinary least squares. For weighted least squares, see Get hat matrix from QR decomposition for weighted least square regression.


An overview

solve is based on the LU factorization of a general square matrix. For X'X, which is symmetric (and should be computed by crossprod(X) rather than t(X) %*% X in R; read ?crossprod for more), we can instead use chol2inv, which is based on the Cholesky factorization.
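For concreteness, here is a minimal sketch of that symmetric route (the helper name H0 is mine; this is just the direct replacement for solve, not one of the more stable methods recommended below, and it assumes X is an N x J matrix of full column rank with X'X well conditioned):

## direct (X'X)^{-1} via Cholesky -- only safe when X is well conditioned
H0 <- function (X) {
  XtX_inv <- chol2inv(chol(crossprod(X)))  ## (X'X)^{-1}
  X %*% XtX_inv %*% t(X)                   ## H = X (X'X)^{-1} X'
}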

However, triangular factorization is less stable than QR factorization. This is not hard to understand: if X has condition number kappa, then X'X has condition number kappa ^ 2, which can cause serious numerical difficulty. The error message you got:

# system is computationally singular: reciprocal condition number = 2.26005e-28

is telling us exactly this. The reciprocal condition number 1 / kappa ^ 2 is about 1e-28, far smaller than the machine precision of about 1e-16. With tolerance tol = .Machine$double.eps, X'X is treated as rank deficient, so LU and Cholesky factorization break down.
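You can check the squaring effect directly with kappa (a quick demonstration using the same toy matrix generated later in this answer; the approximate values quoted in the comments follow from the singular values reported below, d[1] / d[5] ≈ 4.93):

set.seed(0); X <- matrix(rnorm(50), 10, 5)
kappa(X, exact = TRUE)             ## ~ 4.93, the ratio of extreme singular values
kappa(crossprod(X), exact = TRUE)  ## ~ 24.3, i.e., roughly kappa(X) ^ 2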

Generally, we switch to SVD or QR in this situation, but pivoted Cholesky factorization is another choice.

  • SVD is the most stable method, but also the most expensive;
  • QR is very stable, has moderate computational cost, and is the one most commonly used in practice;
  • Pivoted Cholesky is fast, with acceptable stability; for large matrices it is the preferred choice.

In the following, I will explain all three methods.


Using QR factorization


Note that the projection matrix is permutation independent: permuting the columns of X does not change its column space, so it does not matter whether we perform the QR factorization with or without pivoting.

In R, qr.default can call LINPACK routine DQRDC for non-pivoted QR factorization, and LAPACK routine DGEQP3 for block pivoted QR factorization. Let's generate a toy matrix and test both options:

set.seed(0); X <- matrix(rnorm(50), 10, 5)
qr_linpack <- qr.default(X)
qr_lapack <- qr.default(X, LAPACK = TRUE)

str(qr_linpack)
# List of 4
# $ qr   : num [1:10, 1:5] -3.79 -0.0861 0.3509 0.3357 0.1094 ...
# $ rank : int 5
# $ qraux: num [1:5] 1.33 1.37 1.03 1.01 1.15
# $ pivot: int [1:5] 1 2 3 4 5
# - attr(*, "class")= chr "qr"

str(qr_lapack)
# List of 4
# $ qr   : num [1:10, 1:5] -3.79 -0.0646 0.2632 0.2518 0.0821 ...
# $ rank : int 5
# $ qraux: num [1:5] 1.33 1.21 1.56 1.36 1.09
# $ pivot: int [1:5] 1 5 2 4 3
# - attr(*, "useLAPACK")= logi TRUE
# - attr(*, "class")= chr "qr"

Note that $pivot differs between the two objects.

Now, we define a wrapper function to compute QQ':

f <- function (QR) {
  ## thin Q-factor: Q applied to the first `rank` columns of the identity
  Q <- qr.qy(QR, diag(1, nrow = nrow(QR$qr), ncol = QR$rank))
  ## H = QQ'
  tcrossprod(Q)
}

We will see that qr_linpack and qr_lapack give the same projection matrix:

H1 <- f(qr_linpack)
H2 <- f(qr_lapack)

mean(abs(H1 - H2))
# [1] 9.530571e-17
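A practical aside: when N is large, forming the N x N matrix H may be infeasible, and often we only need Hy. Base R's qr.fitted computes Hy directly from the QR object without ever forming H (a sketch, with y an arbitrary illustrative response vector):

y <- rnorm(10)                    ## an arbitrary response, for illustration
fit1 <- qr.fitted(qr_linpack, y)  ## Hy straight from the QR object
fit2 <- drop(H1 %*% y)            ## via the explicit projection matrix

mean(abs(fit1 - fit2))
## ~ 1e-16: the two agree up to rounding error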

Using singular value decomposition


In R, svd computes singular value decomposition. We still use the above example matrix X:

SVD <- svd(X)

str(SVD)
# List of 3
# $ d: num [1:5] 4.321 3.667 2.158 1.904 0.876
# $ u: num [1:10, 1:5] -0.4108 -0.0646 -0.2643 -0.1734 0.1007 ...
# $ v: num [1:5, 1:5] -0.766 0.164 0.176 0.383 -0.457 ...

H3 <- tcrossprod(SVD$u)

mean(abs(H1 - H3))
# [1] 1.311668e-16

Again, we get the same projection matrix.
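A bonus of the SVD route is that it extends naturally to rank-deficient X: keep only the left singular vectors whose singular values exceed a tolerance (a sketch; this particular relative cutoff is a common convention, not something svd enforces for you):

## rank-revealing projection: drop directions with negligible singular values
tol <- max(dim(X)) * .Machine$double.eps * SVD$d[1]   ## relative tolerance
r <- sum(SVD$d > tol)                                 ## numerical rank of X
H3r <- tcrossprod(SVD$u[, seq_len(r), drop = FALSE])  ## projection onto the rank-r column space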


Using Pivoted Cholesky factorization


For demonstration, we still use the example X above.

## pivoted Chol for `X'X`; we want lower triangular factor `L = R'`:
## we also suppress possible rank-deficient warnings (no harm at all!)
L <- t(suppressWarnings(chol(crossprod(X), pivot = TRUE)))

str(L)
# num [1:5, 1:5] 3.79 0.552 -0.82 -1.179 -0.182 ...
# - attr(*, "pivot")= int [1:5] 1 5 2 4 3
# - attr(*, "rank")= int 5

## compute `Q'`
r <- attr(L, "rank")
piv <- attr(L, "pivot")
Qt <- forwardsolve(L, t(X[, piv]), r)

## P = QQ'
H4 <- crossprod(Qt)

## compare
mean(abs(H1 - H4))
# [1] 6.983997e-17

Again, we get the same projection matrix.
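Finally, since the OP asked for a neat general solution, here is the pivoted Cholesky route packaged as a reusable function (a sketch; the name proj_chol is mine, and the body simply wraps the steps above, so it inherits their rank handling):

proj_chol <- function (X) {
  ## pivoted Cholesky of X'X; lower triangular factor L = R'
  L <- t(suppressWarnings(chol(crossprod(X), pivot = TRUE)))
  r <- attr(L, "rank")
  piv <- attr(L, "pivot")
  ## Q' by forward substitution on the pivoted columns of X
  Qt <- forwardsolve(L, t(X[, piv]), r)
  ## H = QQ'
  crossprod(Qt)
}

H5 <- proj_chol(X)
mean(abs(H1 - H5))
## ~ 1e-16, matching the step-by-step computation above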
