_pickle.PicklingError when training a network with num_workers positive
I was training on a big dataset using PyTorch's Dataset and DataLoader when this problem appeared. I stripped my code down to a minimal example for this question, and the problem still occurs. The following code can run on its own:
import numpy as np
from torch.utils.data import DataLoader, Dataset


# Minimal map-style dataset wrapping two in-memory arrays.
class PGDataset(Dataset):
    def __init__(self, X, y):
        self.data_x = X
        self.data_y = y

    def __len__(self):
        return len(self.data_x)

    def __getitem__(self, item):
        return self.data_x[item], self.data_y[item]


def train(dataloader):
    # Just iterate over all batches for a few epochs; no model involved.
    epochs = 5
    for epoch in range(epochs):
        for i, (data_x, data_y) in enumerate(dataloader):
            pass
    print('Train ended successfully.')


if __name__ == '__main__':
    X, y = np.random.rand(200, 4), np.random.randint(0, 5, 200, dtype='int64')
    train_set = PGDataset(X, y)
    dataloader = DataLoader(train_set, batch_size=50, shuffle=True, num_workers=4)
    train(dataloader)
This code looks very standard to me, and I do not know what is causing the problem.
I am using python==3.9.7, numpy==1.21.5, pytorch==1.10.1.
By the way, I just updated PyCharm to 2022.1.
I have been trying to fix this all day and have searched many websites. What I found:
- If I set num_workers in the DataLoader to 0, it works. But for a big dataset I need it to be greater than 0 (see the sketch after this list).
- In PyCharm debug mode, there are no errors either.
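For reference, the only difference between the configuration that fails for me and the one that works is the num_workers argument. Here is a minimal sketch of that single change; train_set is the PGDataset instance from the script above, and the comments describe what I observe on my setup:

from torch.utils.data import DataLoader

# Works for me: batches are loaded in the main process.
dataloader = DataLoader(train_set, batch_size=50, shuffle=True, num_workers=0)

# Fails for me with _pickle.PicklingError: batches are loaded by 4 worker processes.
dataloader = DataLoader(train_set, batch_size=50, shuffle=True, num_workers=4)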
Why does this happen?