Network stops learning as soon as the batch size is set to > 1
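Today I started switching from Keras to PyTorch and tried a simple feed-forward network. It is supposed to learn the square function, i.e. f(x) = x^2. However, my network only learns reasonably well if I set the batch size to 1. Any other batch size gives very poor results. I also tried learning rates between 1 and 0.0001 to see whether that would fix it, and tested some changes to the network, but to no avail. Can anyone tell me what I am doing wrong, i.e. why my network fails to learn as soon as I set the batch size to anything greater than 1? A minimal working example is below. Thanks for your help!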
import numpy as np
from random import randint
import random
import time
from multiprocessing import Pool
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torchvision import datasets, transforms


class SquareDataset(Dataset):
    def __init__(self, num_samples):
        super(Dataset, self).__init__()
        self.num_samples = num_samples
        self.train = [None] * num_samples
        self.target = [None] * num_samples
        for i in range(0, num_samples):
            self.train[i] = random.random() * randint(1, 10)
            self.target[i] = self.train[i] ** 2

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        return self.train[index], self.target[index]


def trainNetwork(epochs=50):
    data_train = SquareDataset(num_samples=1000)
    data_train_loader = DataLoader(data_train, batch_size=1, shuffle=False)
    model = nn.Sequential(nn.Linear(1, 32),
                          nn.ReLU(),
                          nn.Linear(32, 32),
                          nn.ReLU(),
                          nn.Linear(32, 1))
    # Define the loss
    criterion = nn.MSELoss()
    # Optimizers require the parameters to optimize and a learning rate
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    for e in range(epochs):
        running_loss = 0
        for number, labels in data_train_loader:
            optimizer.zero_grad()
            number = number.view(number.size(0), -1)
            output = model(number.float())
            loss = criterion(output, labels.float())
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        else:
            print(f"Training loss: {running_loss/len(data_train_loader)}")
    # some test outputs
    sample = torch.tensor([0.2])
    out = model(sample.float())
    print("Out:")
    print(out.item())
    sample = torch.tensor([1])
    out = model(sample.float())
    print("Out:")
    print(out.item())


trainNetwork()
Comments (1)
In the line
loss = criterion(output, labels.float())
the first tensor has shape (batch_size, 1), while labels has shape (batch_size,). Hence, when batch_size > 1, broadcasting occurs: the difference is computed on a (batch_size, batch_size) grid, so every prediction is compared with every target in the batch, and this leads to the wrong objective. With batch_size = 1 both shapes reduce to a single element, which is why that setting still trains. To fix the issue, rewrite the loss line so that both tensors have the same shape.
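The snippet the answer originally ended with is not preserved here, so the following is only a minimal sketch of the shape mismatch and two possible fixes, not necessarily the answerer's exact code. The toy tensors and the names `broadcast_loss`, `loss_a`, `loss_b` are assumptions made for illustration. The predictions are deliberately set equal to the targets, so a correct loss must be zero.

```python
import torch
from torch import nn

# Toy batch (illustrative values): the predictions equal the targets exactly.
output = torch.tensor([[1.0], [4.0], [9.0], [16.0]])  # shape (4, 1), like the model output
labels = torch.tensor([1.0, 4.0, 9.0, 16.0])          # shape (4,), like the DataLoader targets

# Mismatched shapes broadcast to (4, 4): every prediction is paired with every target.
diff = output - labels
print(diff.shape)  # torch.Size([4, 4])

# Mean squared error over the broadcast grid is non-zero despite perfect predictions.
broadcast_loss = (diff ** 2).mean()
print(broadcast_loss.item())  # 64.5

criterion = nn.MSELoss()

# Option 1: reshape the targets to (batch_size, 1) to match the output.
loss_a = criterion(output, labels.view(-1, 1))

# Option 2: drop the trailing dimension of the output to match the targets.
loss_b = criterion(output.squeeze(1), labels)

print(loss_a.item(), loss_b.item())  # 0.0 0.0
```

Applied to the training loop in the question, the first option would read `loss = criterion(output, labels.float().view(-1, 1))`. Either option should work, as long as both arguments to `criterion` end up with identical shapes.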