PyTorch Python distributed multiprocessing: gather/concatenate tensor arrays of different lengths/sizes
If you have tensor arrays of different lengths across several GPU ranks, the default all_gather method does not work, as it requires the lengths to be the same. For example, if you have:

if gpu == 0:
    q = torch.tensor([1.5, 2.3], device=torch.device(gpu))
else:
    q = torch.tensor([5.3], device=torch.device(gpu))

and need to gather these two tensor arrays as follows:

all_q = [torch.tensor([1.5, 2.3]), torch.tensor([5.3])]

the default torch.distributed.all_gather does not work, because the lengths, 2 and 1, are different.
As it is not directly possible to gather tensors of different lengths using the built-in methods, we need to write a custom function with the following steps:

1. Use dist.all_gather to get the sizes of all arrays.
2. Pad the local array with zeros up to the maximum size.
3. Use dist.all_gather to get all the padded arrays, then trim each one back to its original size.

Once we are able to do the above, we can easily use torch.cat to further concatenate into a single array if needed.

Adapted from: github
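The function from the original answer did not survive extraction, but the steps above can be sketched as follows. This is a minimal reconstruction of the size-exchange-plus-padding approach for 1-D tensors; the helper name all_gather_variable is ours, not from the original post.

```python
import torch
import torch.distributed as dist


def all_gather_variable(q):
    """Gather 1-D tensors of different lengths from all ranks.

    Sketch of the approach described above (hypothetical helper,
    not the original author's code): exchange sizes, pad to the
    maximum length, gather, then trim.
    """
    world_size = dist.get_world_size()

    # 1. Exchange lengths so every rank knows every other rank's size.
    local_size = torch.tensor([q.numel()], device=q.device)
    size_list = [torch.zeros_like(local_size) for _ in range(world_size)]
    dist.all_gather(size_list, local_size)
    sizes = [int(s.item()) for s in size_list]
    max_size = max(sizes)

    # 2. Pad the local tensor with zeros up to the maximum length.
    padded = torch.zeros(max_size, dtype=q.dtype, device=q.device)
    padded[: q.numel()] = q

    # 3. Gather the padded tensors, then trim each back to its true size.
    gathered = [torch.zeros_like(padded) for _ in range(world_size)]
    dist.all_gather(gathered, padded)
    return [g[:s] for g, s in zip(gathered, sizes)]
```

If a single concatenated array is wanted, `torch.cat(all_gather_variable(q))` gives it directly.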
Here is an extension of @omsrisagar's solution that supports tensors of any number of dimensions (not only 1-dimensional tensors).
Note that this requires all the tensors to have the same number of dimensions, and all of their dimensions except the first to be equal.
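The code for this extension was also lost in extraction. A sketch of what it describes, padding only along the first dimension so trailing dimensions must match across ranks (the helper name all_gather_nd is ours):

```python
import torch
import torch.distributed as dist


def all_gather_nd(t):
    """Gather N-D tensors whose first dimension differs across ranks.

    Hypothetical reconstruction of the N-D extension described above:
    only dim 0 may vary; all remaining dimensions must be equal.
    """
    world_size = dist.get_world_size()

    # Exchange first-dimension sizes across ranks.
    local_size = torch.tensor([t.shape[0]], device=t.device)
    size_list = [torch.zeros_like(local_size) for _ in range(world_size)]
    dist.all_gather(size_list, local_size)
    sizes = [int(s.item()) for s in size_list]
    max_size = max(sizes)

    # Pad along dim 0 to the maximum size; trailing dims are unchanged.
    padded = torch.zeros((max_size, *t.shape[1:]), dtype=t.dtype, device=t.device)
    padded[: t.shape[0]] = t

    # Gather the padded tensors, then slice each back to its true length.
    gathered = [torch.zeros_like(padded) for _ in range(world_size)]
    dist.all_gather(gathered, padded)
    return [g[:s] for g, s in zip(gathered, sizes)]
```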
Padding the data has been unnecessary since PyTorch 1.6.0 introduced
all_to_all,
which accepts variable shapes; this works even if the tensors have different numbers of dimensions.