SciPy: sparse matrix multiplication memory error
I want to multiply a sparse matrix by its transpose (both are large matrices). Specifically, I have:
from scipy.sparse import csc_matrix
C = csc_matrix(...)
Ct = C.transpose()
L = Ct*C
and shapes:
C.shape
(1791489, 28508141)
Ct.shape
(28508141, 1791489)
And I am getting the following error:
Traceback (most recent call last):
  File "C:\...\modularity.py", line 373, in <module>
    L = Ct*C
  File "C:\...\anaconda3\lib\site-packages\scipy\sparse\base.py", line 480, in __mul__
    return self._mul_sparse_matrix(other)
  File "C:\...\anaconda3\lib\site-packages\scipy\sparse\compressed.py", line 518, in _mul_sparse_matrix
    indices = np.empty(nnz, dtype=idx_dtype)
MemoryError: Unable to allocate 1.11 TiB for an array with shape (152087117507,) and data type int64
I cannot figure out why. Why does it try to allocate memory for such a huge array?
Update: I am currently trying to do the multiplication in chunks, like this:
chunksize = 1000
# Ceiling division so that a final partial chunk is not dropped
numiter = -(-Ct.shape[0] // chunksize)
blocks = []
for i in range(numiter):
    A = Ct[i*chunksize:(i+1)*chunksize].dot(C)
    blocks.append(A)
But I get:
MemoryError: Unable to allocate 217. MiB for an array with shape (57012620,) and data type int32
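As to why such a huge array is requested: Ct*C has shape (28508141, 28508141), and the 152087117507 in the first error appears to be the number of entries SciPy expects the product to store. At 8 bytes per int64 index, that is about 1.11 TiB for the index array alone. A rough way to see where such a count comes from is sketched below on a small random stand-in matrix (the sizes, density and variable names are illustrative assumptions, not the asker's data): every row of C with k stored entries can contribute up to k*k entries to C.T @ C.

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Small stand-in for C (the real matrix is 1791489 x 28508141).
C = sparse_random(200, 3000, density=0.01, format="csr", random_state=0)

# Number of stored entries in each row of C.
row_counts = np.diff(C.indptr).astype(np.int64)

# Row i of C touches row_counts[i] columns, so it can contribute up to
# row_counts[i] ** 2 entries to C.T @ C. Summing gives an upper bound
# that is computed without forming the product.
upper_bound = int(np.sum(row_counts ** 2))

# On this small example the product fits in memory, so compare directly.
actual = (C.T @ C).nnz

print("nnz(C) =", C.nnz)
print("nnz(C.T @ C) =", actual, "<= bound", upper_bound)
```

Collecting the chunked results in a Python list, as in the update, does not lower the total: the blocks together still hold every entry of the product, so they have to be written to disk or reduced as they are produced.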
Comments (1)
For future readers who want to multiply huge sparse matrices: I solved my problem by using PyTables and saving the result of the multiplication in chunks. It still creates a big file, but at least it is compressed. The code I used goes like this:
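The answer's code is not preserved in this copy. What follows is only a minimal sketch of the approach described, not the author's code. It assumes the product is stored as the three CSR arrays (data, indices, indptr) in compressed, extendable PyTables arrays. The file name, node names, chunk size and the small random stand-in for C are all hypothetical.

```python
import os
import tempfile

import numpy as np
import tables as tb
from scipy.sparse import random as sparse_random

# Small stand-in for C; sizes, chunk size and file name are illustrative.
C = sparse_random(50, 400, density=0.02, format="csr", random_state=0)
Ct = C.T.tocsr()  # CSR makes row slicing cheap
chunksize = 100
path = os.path.join(tempfile.mkdtemp(), "product.h5")

filters = tb.Filters(complevel=5, complib="zlib")
with tb.open_file(path, mode="w") as h5:
    # Extendable 1-D arrays holding the CSR pieces of the full product.
    data = h5.create_earray(h5.root, "data", tb.Float64Atom(),
                            shape=(0,), filters=filters)
    indices = h5.create_earray(h5.root, "indices", tb.Int64Atom(),
                               shape=(0,), filters=filters)
    indptr = h5.create_earray(h5.root, "indptr", tb.Int64Atom(),
                              shape=(0,), filters=filters)
    indptr.append(np.array([0], dtype=np.int64))

    offset = 0  # number of stored entries written so far
    for start in range(0, Ct.shape[0], chunksize):
        # Only one block of rows of the product is in memory at a time.
        block = (Ct[start:start + chunksize] @ C).tocsr()
        block.sort_indices()
        if block.nnz:
            data.append(block.data.astype(np.float64))
            indices.append(block.indices.astype(np.int64))
        # Shift the block's row pointers so they index the global arrays.
        indptr.append(block.indptr[1:].astype(np.int64) + offset)
        offset += block.nnz
```

Writing each block to disk as soon as it is computed keeps peak memory near the size of one block, at the cost of a large (though compressed) file.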
So if, for example, you want to access the 2nd row of your final matrix, you can simply do:
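The row-access snippet is also missing from this copy. As a hedged sketch, assuming the hypothetical layout of three arrays named data, indices and indptr in the HDF5 file, one row can be read by slicing only the part of the file that indptr points to. The helper name read_row and the tiny example matrix are illustrative.

```python
import os
import tempfile

import numpy as np
import tables as tb
from scipy.sparse import csr_matrix

# Tiny stand-in for the stored product, written with the assumed
# three-array CSR layout (data, indices, indptr).
M = csr_matrix(np.array([[1.0, 0.0, 2.0],
                         [0.0, 3.0, 4.0],
                         [5.0, 0.0, 0.0]]))
path = os.path.join(tempfile.mkdtemp(), "product.h5")
with tb.open_file(path, mode="w") as h5:
    h5.create_array(h5.root, "data", M.data)
    h5.create_array(h5.root, "indices", M.indices)
    h5.create_array(h5.root, "indptr", M.indptr)


def read_row(h5, i, ncols):
    """Return row i as a 1 x ncols CSR matrix, reading only that row's slice."""
    start, stop = (int(v) for v in h5.root.indptr[i:i + 2])
    data = h5.root.data[start:stop]
    cols = h5.root.indices[start:stop]
    rowptr = np.array([0, stop - start])
    return csr_matrix((data, cols, rowptr), shape=(1, ncols))


with tb.open_file(path, mode="r") as h5:
    second_row = read_row(h5, 1, ncols=3)  # index 1 is the 2nd row

print(second_row.toarray())
```

Only the slice for the requested row is read from disk, so the full matrix never has to be loaded.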