Solving an overdetermined constraint system
I have n real-number variables (don't know, don't really care), let's call them X[n].
I also have m >> n relationships between them, let's call them R[m], of the form:
X[i] = alpha*X[j], where alpha is a nonzero positive real number, i and j are distinct, but the (i, j) pair is not necessarily unique (i.e. there can be two relationships between the same variables with different alpha factors).
What I'm trying to do is find a set of alpha parameters that solve the overdetermined system in some least-squares sense. The ideal solution would be to minimize the sum of squared differences between each equation's parameter and its chosen value, but I'm satisfied with the following approximation:
If I turn the m equations into an overdetermined system in n unknowns, any pseudo-inverse based numeric solver will give me the obvious solution (all zeroes). So what I currently do is add another equation into the mix, x[0] = 1 (actually any constant will do), and solve the generated system in the least-squares sense using the Moore-Penrose pseudo-inverse. While this tries to minimize the sum of (x[0] - 1)^2 together with the squares of the x[i] - alpha*x[j] residuals, I find it a good and numerically stable approximation to my problem. Here is an example:
a = 1
a = 2*b
b = 3*c
a = 5*c
in Octave:
A = [
1 0 0;
1 -2 0;
0 1 -3;
1 0 -5;
]
B = [1; 0; 0; 0]
C = pinv(A) * B
or, better yet:
C = pinv(A)(:,1)
Which yields the values for a, b, c: [0.99383; 0.51235; 0.19136]
Which gives me the following (reasonable) relationships:
a = 1.9398*b
b = 2.6774*c
a = 5.1935*c
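For clarity, these ratios are just the quotients of the solved values in C:
C(1) / C(2)   % ~ 1.9398, i.e. a = 1.9398*b
C(2) / C(3)   % ~ 2.6774, i.e. b = 2.6774*c
C(1) / C(3)   % ~ 5.1935, i.e. a = 5.1935*c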
So right now I need to implement this in C / C++ / Java, and I have the following questions:
Is there a faster method to solve my problem, or am I on the right track with generating the overdetermined system and computing the pseudo-inverse?
My current solution requires a singular value decomposition and three matrix multiplications, which is a little much considering m can be 5000 or even 10000. Are there faster ways to compute the pseudo-inverse, given the sparsity of the matrix (each row contains exactly two non-zero values, one of which is always 1 and the other always negative)? Actually, I only need its first column, not the entire matrix, since B is zero except for the first row.
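(As a concrete sketch of that sparsity structure, the whole system can be assembled directly in sparse form in Octave. The vectors I, J and ALPHA below are hypothetical names for the relation data, one entry per relation X[i] = alpha*X[j], using 1-based indices; they are not defined anywhere above.)
m = numel(I);                            % number of relations
n = max([I(:); J(:)]);                   % number of variables
rows = [1; (2:m+1)'; (2:m+1)'];          % row 1 is the extra x(1) = 1 equation
cols = [1; I(:); J(:)];
vals = [1; ones(m, 1); -ALPHA(:)];       % each relation row holds a +1 and a -alpha
A = sparse(rows, cols, vals, m + 1, n);
B = [1; zeros(m, 1)];                    % right-hand side: 1 for the pinned equation, 0 elsewhere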
What math libraries would you suggest using for this? Is LAPACK OK?
I'm also open to any other suggestions, provided that they are numerically stable and asymptotically fast (let's say k*n^2, where k can be large).
2 Answers
Your problem is ill-posed. If you treat the problem as a function of the n variables (the sum of squared differences), the function has exactly ONE global minimum hyperplane.
That global minimum will always contain zero unless you fix one of the variables to be nonzero, or reduce the function domain in some other way.
If what you want is a parameterization of the solution hyperplane, you can get that from the Moore-Penrose pseudo-inverse (http://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse); check the section on obtaining all solutions.
(Please note I've used the word "hyperplane" in a technically incorrect manner. I mean a "flat" unbounded subset of your parameter space: a line, a plane, something that can be parameterized by one or more vectors. For some reason I can't find the general noun for such objects.)
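As an illustration of the "obtaining all solutions" idea from that page: all solutions of the homogeneous system (the one without the x[0] = 1 row, call it A0 below) can be written as X = (I - pinv(A0)*A0)*w for an arbitrary vector w. A sketch in Octave, reusing the A from the question:
A0 = A(2:end, :);                        % drop the artificial x(1) = 1 row
N  = eye(size(A0, 2)) - pinv(A0) * A0;   % projector onto the null space of A0
w  = randn(size(A0, 2), 1);              % any vector
X  = N * w;                              % a point on the solution set of A0*X = 0
For the example A above, the three relations are mutually inconsistent (a = 2*b and b = 3*c imply a = 6*c, not a = 5*c), so the null space is trivial and X comes out as numerically zero, which is exactly the degeneracy the x[0] = 1 trick works around.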
The SVD approach is numerically very stable but not very fast. If you use SVD, then LAPACK is a good library to use. If it's just a one-off computation, then it's probably fast enough.
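For concreteness, a minimal sketch of the SVD route in Octave, equivalent to the pinv approach in the question (this assumes A has full column rank, so S is invertible). In LAPACK terms, the xGELSS / xGELSD least-squares drivers do essentially this in a single call:
[U, S, V] = svd(A, 0);    % economy-size SVD: U is m x n, S and V are n x n
X = V * (S \ (U' * B));   % least-squares solution, same as pinv(A) * B here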
If you need a substantially faster algorithm, you might have to sacrifice stability. One possibility would be to use the QR factorization. You'll have to read up on this to see the details, but part of the reasoning goes as follows. If AP = QR (where P is a permutation matrix, Q is an orthogonal matrix, and R is a triangular matrix) is the economy QR-decomposition of A, then the equation AX = B becomes Q R P^{-1} X = B and the solution is X = P R^{-1} Q^T B. The following Octave code illustrates this using the same A and B as in your code.
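A minimal sketch of that computation, assuming Octave's pivoted economy-size qr, which returns the permutation as an index vector p with A(:, p) = Q*R:
[Q, R, p] = qr(A, 0);       % economy QR with column pivoting
X = zeros(size(A, 2), 1);
X(p) = R \ (Q' * B);        % i.e. X = P * R^{-1} * Q^T * B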
The nice thing about this is that you can exploit the sparsity of A by doing a sparse QR decomposition. There is some explanation in the Octave help for the qr function but it did not work for me immediately.
Even faster (but also even less stable) is to use the normal equations: If A X = B then A^T A X = A^T B. The matrix A^T A is a square matrix of (hopefully) full rank, so you can use any solver for linear equations. Octave code:
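Again only a sketch, with the same A and B:
M = A' * A;           % n x n and (hopefully) of full rank
X = M \ (A' * B);     % solve the normal equations A'*A*X = A'*B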
Again, sparsity can be exploited in this approach. There are many methods and libraries for solving sparse linear systems; a popular one seems to be UMFPACK.
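A sketch of how that could look here, assuming A is assembled (or converted) as a sparse matrix; Octave's sparse backslash dispatches to SuiteSparse solvers (CHOLMOD/UMFPACK) under the hood:
As = sparse(A);                 % skip if A is already stored sparse
X = (As' * As) \ (As' * B);     % sparse normal equations; A'*A is n x n and sparse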
Added later: I don't know enough about this field to quantify. Whole books have been written on this. Perhaps QR is about a factor of 3 or 5 faster than SVD, and the normal equations twice as fast again. The effect on numerical stability depends on your matrix A. Sparse algorithms can be much faster (say by a factor of m), but their computational cost and numerical stability depend very much on the problem, in ways that are sometimes not well understood.
In your use case, my recommendation would be to try computing the solution with the SVD, see how long it takes, and if that is acceptable then just use that (I guess it would be about a minute for n=1000 and m=10000). If you want to study it further, try also QR and normal equations and see how much faster they are and how accurate; if they give approximately the same solution as SVD then you can be pretty confident they are accurate enough for your purposes. Only if these are all too slow and you are willing to sink some time into it, look at sparse algorithms.