SGD with batch size > 1?

I'm taking the "Deep NNs with PyTorch" course by IBM, and I encountered lab examples where SGD is used as the optimizer while the batch size is >1 in the DataLoader.
If I understand correctly, SGD performs gradient descent with only 1 training example in each step, so in this case how does SGD interact with each batch of training examples?
For example, if batch size = 20, would the SGD optimizer perform 20 GD steps on each batch? If so, does that mean that no matter what batch size I set for the DataLoader, the SGD optimizer performs (# of training examples) GD steps in one epoch?
import torch
from torch import nn
from torch.utils.data import DataLoader

# Net, accuracy, and data_set are defined earlier in the lab.

def train(data_set, model, criterion, train_loader, optimizer, epochs=100):
    LOSS = []
    ACC = []
    for epoch in range(epochs):
        for x, y in train_loader:      # one iteration per batch
            print(x, y)
            optimizer.zero_grad()      # clear gradients from the previous step
            yhat = model(x)
            loss = criterion(yhat, y)
            loss.backward()
            optimizer.step()           # one parameter update per batch
            LOSS.append(loss.item())
        ACC.append(accuracy(model, data_set))
    ...

Layers = [2, 50, 3]
model = Net(Layers)
learning_rate = 0.10
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
train_loader = DataLoader(dataset=data_set, batch_size=20)
criterion = nn.CrossEntropyLoss()
LOSS = train(data_set, model, criterion, train_loader, optimizer, epochs=100)
1 Answer
No. Batch size = 20 means it processes all 20 samples at once, gets a single scalar loss for the batch, and backpropagates that error. That is one step of GD.
This is known as minibatch SGD: instead of taking 1 example per step as in classic SGD, it takes 20, and everything else stays the same.
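So the number of optimizer steps per epoch is ceil(# of examples / batch_size), not the number of examples. Here is a minimal sketch that checks this on a synthetic dataset; the TensorDataset, the sizes, and the nn.Linear stand-in for the lab's Net are assumptions for illustration, not the lab's actual code:

import math
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the lab's data_set: 100 examples, 2 features, 3 classes.
X = torch.randn(100, 2)
y = torch.randint(0, 3, (100,))
loader = DataLoader(TensorDataset(X, y), batch_size=20)

model = nn.Linear(2, 3)                # placeholder for the lab's Net
optimizer = torch.optim.SGD(model.parameters(), lr=0.10)
criterion = nn.CrossEntropyLoss()

steps = 0
for xb, yb in loader:                  # one iteration per batch of 20
    optimizer.zero_grad()
    loss = criterion(model(xb), yb)    # scalar loss, averaged over the batch
    loss.backward()                    # one backward pass per batch
    optimizer.step()                   # one parameter update per batch
    steps += 1

print(steps)                           # 5 == ceil(100 / 20), not 100
assert steps == math.ceil(len(X) / 20)

With batch_size=1 this loop would take 100 steps per epoch (classic SGD); with batch_size=100 it would take 1 step (full-batch GD). torch.optim.SGD itself doesn't distinguish these cases: it simply applies one update per .step() call, using whatever gradients the preceding backward() produced.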