C++ Ax=b 线性代数系统的内存高效解决方案
我正在使用 Boost UBlas 的数字库绑定来求解简单的线性系统。 以下工作正常,除了它仅限于处理相对矩阵 A(mxm) 小“m”。
实际上,我有一个更大的矩阵,尺寸为 m= 10^6(最多 10^7)。
是否存在有效使用内存来求解 Ax=b 的现有 C++ 方法?
#include<boost/numeric/ublas/matrix.hpp>
#include<boost/numeric/ublas/io.hpp>
#include<boost/numeric/bindings/traits/ublas_matrix.hpp>
#include<boost/numeric/bindings/lapack/gesv.hpp>
#include <boost/numeric/bindings/traits/ublas_vector2.hpp>
// compileable with this command
//g++ -I/home/foolb/.boost/include/boost-1_38 -I/home/foolb/.boostnumbind/include/boost-numeric-bindings solve_Axb_byhand.cc -o solve_Axb_byhand -llapack
namespace ublas = boost::numeric::ublas;
namespace lapack= boost::numeric::bindings::lapack;
int main()
{
ublas::matrix<float,ublas::column_major> A(3,3);
ublas::vector<float> b(3);
for(unsigned i=0;i < A.size1();i++)
for(unsigned j =0;j < A.size2();j++)
{
std::cout << "enter element "<<i << j << std::endl;
std::cin >> A(i,j);
}
std::cout << A << std::endl;
b(0) = 21; b(1) = 1; b(2) = 17;
lapack::gesv(A,b);
std::cout << b << std::endl;
return 0;
}
I am using Numeric Library Bindings for Boost UBlas to solve a simple linear system.
The following works fine, except it is limited to handling matrices A(m x m) for relatively
small 'm'.
In practice I have a much larger matrix with dimension m= 10^6 (up to 10^7).
Is there existing C++ approach for solving Ax=b that uses memory efficiently.
#include<boost/numeric/ublas/matrix.hpp>
#include<boost/numeric/ublas/io.hpp>
#include<boost/numeric/bindings/traits/ublas_matrix.hpp>
#include<boost/numeric/bindings/lapack/gesv.hpp>
#include <boost/numeric/bindings/traits/ublas_vector2.hpp>
// compileable with this command
//g++ -I/home/foolb/.boost/include/boost-1_38 -I/home/foolb/.boostnumbind/include/boost-numeric-bindings solve_Axb_byhand.cc -o solve_Axb_byhand -llapack
namespace ublas = boost::numeric::ublas;
namespace lapack= boost::numeric::bindings::lapack;
int main()
{
ublas::matrix<float,ublas::column_major> A(3,3);
ublas::vector<float> b(3);
for(unsigned i=0;i < A.size1();i++)
for(unsigned j =0;j < A.size2();j++)
{
std::cout << "enter element "<<i << j << std::endl;
std::cin >> A(i,j);
}
std::cout << A << std::endl;
b(0) = 21; b(1) = 1; b(2) = 17;
lapack::gesv(A,b);
std::cout << b << std::endl;
return 0;
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
简短回答:不要使用 Boost 的
LAPACK
绑定,它们是为密集矩阵设计的,不是稀疏矩阵,请使用
UMFPACK
代替。长答案:
UMFPACK
是当 A 较大且稀疏时求解 Ax=b 的最佳库之一。下面是生成简单的
A
和 <代码>b并求解
Ax = b
。函数
generate_sparse_matrix_problem
创建矩阵A
和右侧
b
。 该矩阵首先以三元组形式构建。 这向量 Ti、Tj 和 Tx 充分描述了 A。三元组形式很容易创建,但是
高效的稀疏矩阵方法需要压缩稀疏列格式。 转换
使用
umfpack_di_triplet_to_col
执行。使用
umfpack_di_symbolic
执行符号分解。 稀疏的A
的 LU 分解通过umfpack_di_numeric
执行。下三角和上三角求解通过 umfpack_di_solve 执行。
当 n 为 500,000 时,在我的机器上,整个程序运行大约需要一秒钟。
Valgrind 报告分配了 369,239,649 字节(略高于 352 MB)。
请注意此页面讨论了 Boost 对三元组稀疏矩阵(坐标)
和压缩格式。 如果您愿意,您可以编写例程来转换这些 boost 对象
简单数组
UMFPACK
需要作为输入。Short answer: Don't use Boost's
LAPACK
bindings, these were designed for dense matrices,not sparse matrices, use
UMFPACK
instead.Long answer:
UMFPACK
is one of the best libraries for solving Ax=b when A is large and sparse.Below is sample code (based on
umfpack_simple.c
) that generates a simpleA
andb
and solves
Ax = b
.The function
generate_sparse_matrix_problem
creates the matrixA
and theright-hand side
b
. The matrix is first constructed in triplet form. Thevectors Ti, Tj, and Tx fully describe A. Triplet form is easy to create but
efficient sparse matrix methods require Compressed Sparse Column format. Conversion
is performed with
umfpack_di_triplet_to_col
.A symbolic factorization is performed with
umfpack_di_symbolic
. A sparseLU decomposition of
A
is performed withumfpack_di_numeric
.The lower and upper triangular solves are performed with
umfpack_di_solve
.With
n
as 500,000, on my machine, the entire program takes about a second to run.Valgrind reports that 369,239,649 bytes (just a little over 352 MB) were allocated.
Note this page discusses Boost's support for sparse matrices in Triplet (Coordinate)
and Compressed format. If you like, you can write routines to convert these boost objects
to the simple arrays
UMFPACK
requires as input.我假设你的矩阵是稠密的。 如果它很稀疏,您可以找到许多专门的算法,如 DeusAduro 和 达菲莫。
如果您没有(足够大的)集群可供使用,您需要考虑核外算法。 ScaLAPACK 有一些核外求解器作为其原型包,参见文档此处 和谷歌了解更多详情。 在网络上搜索“out-of-core LU /(矩阵)求解器/包”将为您提供大量其他算法和工具的链接。 我不是这些方面的专家。
然而,对于这个问题,大多数人会使用集群。 您几乎可以在任何集群上找到的软件包是 ScaLAPACK。 此外,典型集群上通常还有许多其他软件包,因此您可以挑选适合您问题的软件包(示例 此处和此处 )。
在开始编码之前,您可能想快速检查解决问题需要多长时间。 典型的求解器大约需要 O(3*N^3) 次触发器(N 是矩阵的维度)。 如果 N = 100000,那么您将看到 3000000 Gflops。 假设您的内存求解器每个核心的处理速度为 10 Gflops/s,则单个核心的运行时间为 3 1/2 天。 由于算法可扩展性良好,增加内核数量应该可以接近线性地减少时间。 最重要的是 I/O。
I assume that your matrix is dense. If it is sparse, you can find numerous specialised algorithms as already mentioned by DeusAduro and duffymo.
If you don't have a (large enough) cluster at your disposal, you want to look at out-of-core algorithms. ScaLAPACK has a few out-of-core solvers as part of its prototype package, see the documentation here and Google for more details. Searching the web for "out-of-core LU / (matrix) solvers / packages" will give you links to a wealth of further algorithms and tools. I am not an expert on those.
For this problem, most people would use a cluster, however. The package you will find on almost any cluster is ScaLAPACK, again. In addition, there are usually numerous other packages on the typical cluster, so you can pick and choose what suits your problem (examples here and here).
Before you start coding, you probably want to quickly check how long it will take to solve your problem. A typical solver takes about O(3*N^3) flops (N is dimension of matrix). If N = 100000, you are hence looking at 3000000 Gflops. Assuming that your in-memory solver does 10 Gflops/s per core, you are looking at 3 1/2 days on a single core. As the algorithms scale well, increasing the number of cores should reduce the time close to linearly. On top of that comes the I/O.
假设你的巨大矩阵是稀疏的,我希望它们是那样的大小,看看 PARDISO项目是一个稀疏线性求解器,如果您想处理像您所说的那样大的矩阵,这就是您所需要的。 仅允许有效存储非零值,并且比求解相同的稠密矩阵系统要快得多。
Assuming your huge matrices are sparse, which I hope they are at that size, have a look at the PARDISO project which is a sparse linear solver, this is what you'll need if you want to handle matrices as big as you said. Allows efficient storage of only non-zero values, and is much faster than solving the same system of dense matrices.
不确定 C++ 实现,但如果内存是一个问题,您可以执行以下操作,具体取决于您正在处理的矩阵类型:
Not sure about C++ implementations, but there are several things you can do if memory is an issue depending on the type of matrix you're dealing with:
查看用于解决线性代数的免费软件列表问题,由 Jack Dongarra 和 Hatem Ltaief 编译。
我认为对于您正在考虑的问题大小,您可能需要迭代算法。 如果您不想以稀疏格式存储矩阵 A,则可以使用无矩阵实现。 迭代算法通常不需要访问矩阵 A 的各个条目,它们只需要计算矩阵向量乘积 Av (有时 A^T v,即转置矩阵与向量的乘积)。 因此,如果该库设计良好,那么向它传递一个知道如何计算矩阵向量乘积的类就足够了。
Have a look at the list of freely available software for the solution of linear algebra problems, compiled by Jack Dongarra and Hatem Ltaief.
I think that for the problem size you're looking at, you probably need an iterative algorithm. If you don't want to store the matrix A in a sparse format, you can use a matrix-free implementation. Iterative algorithms typically do not need to access individual entries of the matrix A, they only need to compute matrix-vector products Av (and sometimes A^T v, the product of the transposed matrix with the vector). So if the library is well-designed, it should be enough if you pass it a class that knows how to do matrix-vector products.
正如已接受的答案所示,有 UMFPACK。 但如果您使用 BOOST,您仍然可以使用 BOOST 中的紧凑矩阵并使用 UMFPACK 来求解系统。 有一个绑定可以轻松实现:
http://mathema.tician.de/software /boost-numeric-bindings
它的日期大约有两年,但它只是一个绑定(以及其他一些绑定)。
请参阅相关问题:
UMFPACK 和 BOOST 的 uBLAS 稀疏矩阵
As the accepted answer suggests there is UMFPACK. But if you are using BOOST you can still use the compact matrices in BOOST and use UMFPACK to solve the system. There is a binding which makes it easy:
http://mathema.tician.de/software/boost-numeric-bindings
Its about two years dated, but its just a binding (along with a few others).
see related question:
UMFPACK and BOOST's uBLAS Sparse Matrix