Matlab cell2mat( ... ) function unexpectedly runs out of memory with a cell array of sparse matrices
I get strange behavior with respect to memory with Matlab and the cell2mat() function...
What I would like to do is:
    cell_array_outer = cell(1,N);
    parfor k = 1:N
        cell_array_inner = cell(1,M);
        for i = 1:M
            A = do_some_math_and_return_a_sparse_matrix( );
            cell_array_inner{i} = sparse(A); % do sparse() again just to be paranoid
        end
        cell_array_outer{k} = sparse( cell2mat( cell_array_inner ) );
    end
    Giant_Matrix = cell2mat( cell_array_outer ); % DOH!
But alas, the line marked "DOH" uses an absurd amount of memory, more than you should end up with if you add up the sizes of the sparse matrices... like it's making an intermediate structure that's too big.
The following works fine, but double-indexing doesn't work with parfor, so I can only use one core:
    cell_array_giant = cell(M,N);
    for k = 1:N % cannot use parfor with {i,k} dual indices!
        for i = 1:M
            A = do_some_math_and_return_a_sparse_matrix( );
            cell_array_giant{i,k} = sparse(A); % do sparse() again just to be paranoid
        end
    end
    cell_array_giant = reshape( cell_array_giant, 1, M * N );
    Giant_Matrix = sparse( cell2mat( cell_array_giant ) ); % Ok... but no parfor
My suspicion is that in the latter case, each cell element is much more manageable in size... like a 20,000 x 1 sparse matrix, but in the former those "outer" elements are now 20,000 x 5,000 and somehow don't fit where Matlab would like to put them as temporary variables, so the memory use gets out of control despite their extreme sparsity.
Any rules to follow regarding memory use and the above? Or how can I change my parfor use so that it works in the 2nd case? "parfor" is kind of new, so there's less stuff on the web about it than about other core features... it's much more efficient than running 8 copies of Matlab!
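To make the "add up the sizes" comparison concrete, here is a minimal sketch that sums the storage reported by whos for each block and compares it with the storage of the concatenated result. The sizes, the density, and the use of sprand as stand-in data are assumptions for illustration, and the blocks are concatenated with plain bracket syntax rather than cell2mat. Whether that changes the peak memory seen in the question is not something this sketch establishes.

```matlab
% Hypothetical stand-in data: a 1-by-N cell array of very sparse blocks.
% Sizes and density are assumed and far smaller than in the question.
rows = 2000; cols = 500; N = 4;
blocks = cell(1, N);
for k = 1:N
    blocks{k} = sprand(rows, cols, 1e-4);
end

% Sum the storage of the individual blocks as reported by whos.
bytes_in = 0;
for k = 1:N
    blk = blocks{k};            % whos needs a named variable
    info = whos('blk');
    bytes_in = bytes_in + info.bytes;
end

% Concatenate horizontally with bracket syntax and measure the result.
giant = [blocks{:}];
info = whos('giant');
fprintf('sum of blocks: %d bytes, concatenated: %d bytes\n', bytes_in, info.bytes);
```

This only reports the size of the final variables. Temporary storage used during the concatenation would have to be watched separately, for example in the operating system's process monitor.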
Comments (1)
To predict temporary memory use, we'd have to know more about how the Matlab internals work, which I don't.
For your second solution, you can use parfor if you do it inside the inner loop, I think (I don't get an m-lint warning, at least). If necessary, transpose your problem so that M>N, because you usually want parfor to do lots of quick calculations instead of very few long ones, so that you get less of an overhang if the number of operations isn't divisible by 8 (or however many cores you may run).
Also, would it be possible to construct the giant sparse matrix inside the k-loop? This avoids the reshape altogether. Of course, you'd only be able to parfor the M-loop, since otherwise the giant array would be passed to all the workers, and lots of sadness would ensue.
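As a related sketch (not the approach described above, and untested against the original problem sizes), the two loops can be flattened into a single parfor index, with each iteration storing only the row, column, and value triplets of its column. The giant matrix is then assembled once with sparse(i,j,v,m,n), so cell2mat is never called. The sizes, the density, and make_column, a stand-in for do_some_math_and_return_a_sparse_matrix(), are assumptions.

```matlab
% Hypothetical stand-in for do_some_math_and_return_a_sparse_matrix():
% returns a rows-by-1 sparse column. Sizes and density are assumed.
rows = 2000; M = 50; N = 8;
make_column = @() sprand(rows, 1, 1e-3);

% One cell per column of the final matrix, holding triplet pieces.
total = M * N;
I = cell(total, 1);
J = cell(total, 1);
V = cell(total, 1);

parfor c = 1:total                    % single flat index, so I, J, V are sliced
    % If the computation needs both indices: [i, k] = ind2sub([M, N], c)
    A = feval(make_column);           % feval avoids handle/indexing ambiguity in parfor
    [r, ~, v] = find(A);              % nonzero row positions and values
    I{c} = r;
    J{c} = c * ones(numel(r), 1);     % every entry belongs to column c
    V{c} = v;
end

% Assemble the giant matrix once from all triplets.
Giant_Matrix = sparse(vertcat(I{:}), vertcat(J{:}), vertcat(V{:}), rows, total);
```

The flat index c runs in the same column-major order as reshape( cell_array_giant, 1, M * N ) in the question, so the columns should land in the same positions. Without the Parallel Computing Toolbox, parfor simply runs as an ordinary loop.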