PyTorch distributed DataLoader
Are there any recommended ways to make the PyTorch DataLoader (torch.utils.data.DataLoader) work in a distributed environment, on a single machine and across multiple machines? Can it be done without DistributedDataParallel?

1 Answer
Maybe you need to make your question clearer. DistributedDataParallel is abbreviated as DDP; you use DDP to train a model in a distributed environment. This question seems to ask how to arrange the dataset-loading process for distributed training.

First of all, torch.utils.data.DataLoader works for both distributed and non-distributed training; usually there is nothing you need to change about the loader itself. But the sampling strategy differs between the two modes, so you need to specify a sampler for the DataLoader (the sampler argument of torch.utils.data.DataLoader), and adopting torch.utils.data.distributed.DistributedSampler is the simplest way.
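
For illustration, here is a minimal sketch of wiring torch.utils.data.distributed.DistributedSampler into a DataLoader. It assumes the default process group has already been initialized (for example by launching with torchrun and calling torch.distributed.init_process_group), and the TensorDataset is just a hypothetical stand-in for your own dataset.

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from torch.utils.data.distributed import DistributedSampler

    # Hypothetical toy dataset; replace with your own Dataset.
    dataset = TensorDataset(torch.randn(1000, 16), torch.randint(0, 2, (1000,)))

    # DistributedSampler reads the rank and world size from the initialized
    # process group and gives each process a disjoint shard of the indices.
    sampler = DistributedSampler(dataset, shuffle=True)

    loader = DataLoader(
        dataset,
        batch_size=32,
        sampler=sampler,   # when a sampler is passed, do not also set shuffle=True
        num_workers=4,
        pin_memory=True,
    )

    for epoch in range(10):
        # Re-seed the sampler each epoch so the shuffle order changes across epochs.
        sampler.set_epoch(epoch)
        for inputs, targets in loader:
            ...  # forward/backward as usual

The sampler only needs the process group for rank and world-size information, so this loading setup is the same whether or not you wrap the model in DistributedDataParallel.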