Dictionary support in PyTorch
Does PyTorch support dict-like objects, through which we can backpropagate gradients, like Tensors in PyTorch?
My goal is to compute gradients with respect to a few (1%) of the elements of a large matrix. But if I use PyTorch's standard Tensors to store the matrix, I need to keep the whole matrix on my GPU, which causes problems because of the limited GPU memory available during training. So I was wondering whether I could store the matrix as a dict instead, indexing only the relevant elements of the matrix, and computing and backpropagating gradients w.r.t. those selected elements only.
So far I have only tried plain Tensors, but that runs into the memory issues described above. I have searched extensively for alternatives such as dicts in PyTorch, but couldn't find any such information on Google.
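For concreteness, here is a minimal sketch of the kind of thing I have in mind; the positions, sizes, and the entry helper below are just placeholders, not real code from my project:

```python
import torch

# Placeholder positions: the ~1% of entries of the big matrix that matter,
# mapped to slots in a small trainable tensor, so the full matrix never
# has to live on the GPU.
selected = {(0, 7): 0, (3, 3): 1, (42, 999): 2}           # (row, col) -> slot
values = torch.randn(len(selected), requires_grad=True)   # trainable entries only

def entry(row, col):
    """Look up one matrix element; anything not stored is treated as zero."""
    slot = selected.get((row, col))
    return values[slot] if slot is not None else values.new_zeros(())

# Toy loss that only touches the stored entries.
loss = entry(0, 7) ** 2 + entry(3, 3) * entry(42, 999)
loss.backward()
print(values.grad)   # one gradient per selected element
```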
It sounds like you want your parameter to be a torch.sparse tensor. This interface allows you to have tensors that are mostly zeros, with only a few non-zero elements in known locations. Sparse tensors should allow you to significantly reduce the memory footprint of your model.
Note that this interface is still "under construction": not all operations are supported for sparse tensors. However, it is constantly improving.
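A minimal sketch of what that could look like, assuming a toy loss and made-up sizes and indices (torch.sparse_coo_tensor and torch.sparse.mm are the documented entry points; everything else here is illustrative):

```python
import torch

# Hypothetical example: a 10,000 x 10,000 matrix of which only three
# entries are trainable; only those entries are actually stored.
n = 10_000
indices = torch.tensor([[0, 3, 42],      # row indices of the non-zero entries
                        [7, 3, 999]])    # column indices of the non-zero entries
values = torch.tensor([0.5, -1.2, 2.0])

# requires_grad=True makes the sparse tensor a leaf we can differentiate w.r.t.
weight = torch.sparse_coo_tensor(indices, values, (n, n), requires_grad=True)

x = torch.randn(n, 1)              # dense input
y = torch.sparse.mm(weight, x)     # sparse x dense matmul with autograd support
loss = y.sum()
loss.backward()

# The gradient is itself sparse: only the three stored entries receive one.
print(weight.grad)
```

Gradients are only produced for the stored entries, which matches what the question asks for. Just keep in mind that many operations and optimizers do not accept sparse tensors or sparse gradients yet, so check each one you need.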