大规模伪逆
我想计算一个巨大矩阵的 Moore–Penrose 伪逆 。理想情况下,我希望在具有 2300 万行和 1000 列的矩阵上进行此操作,但如果有必要,我可以通过仅运行实验的一部分来将行数减少到 400 万。
显然,将矩阵加载到内存中并在其上运行 SVD 是行不通的。 维基百科指向Krylov 子空间 方法并提及 Arnoldi , Lanczos, 共轭梯度,GMRES(广义最小值残差)、BiCGSTAB(双共轭梯度稳定)、QMR(准最小残差)、TFQMR(无转置 QMR)和 MINRES(最小残差)方法是最好的 Krylov 子空间方法之一。但我不知道从这里该去哪里。计算如此巨大的矩阵的伪逆是否可行?如果是,使用哪些算法或软件库?我有一个大型计算集群可用,因此欢迎并行方法。
这个答案指向 R包 biglm。那行得通吗?有人用过吗?我通常使用 Python 工作,但不介意使用其他语言和工具来完成此特定任务。
I would like to compute the Moore–Penrose pseudoinverse of an enormous matrix. Ideally, I would like to do it on a matrix that has 23 million rows and 1000 columns, but if necessary I can reduce the number of rows to 4 million by only running on one part of my experiment.
Obviously, loading the matrix in to memory and running SVD on it is not going to work. Wikipedia points to Krylov subspace methods and mention the Arnoldi, Lanczos, Conjugate gradient, GMRES (generalized minimum residual), BiCGSTAB (biconjugate gradient stabilized), QMR (quasi minimal residual), TFQMR (transpose-free QMR), and MINRES (minimal residual) methods as being among the best Krylov subspace methods. But I don't know where to go from here. Is computing the pseudoinverse of such a huge matrix even feasible? If so, using which algorithms or software libraries? I have a large computing cluster available, so parallel approaches are welcome.
This answer points to the R package biglm. Would that work? Has anyone used it? I normally work in Python, but don't mind using other languages and tools for this particular task.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
使用直接收敛到最小二乘解的块迭代算法可能比通过伪逆计算最小二乘解更好。请参阅 Charlie Byrne 的“应用迭代方法”。这些算法与 Krylov 子空间方法密切相关,但经过调整以便于计算。您可以通过查看他的另一本书的预印本的第 3 章来了解相关介绍。
You might be better off using a block iterative algorithm that converges directly to the least squares solution than computing the least squares solution through the pseudoinverse. See "Applied Iterative Methods" by Charlie Byrne. These algorithms are closely related to the Krylov subspace methods, but are tuned for easy computation. You can get an introduction by looking at chapter 3 of this preprint of another of his books.