Never-ending batches

Posted 2025-01-26 09:18:53


In short: I'm training using mutual information with a positive pair (two samples of the same person) and one negative pair. I don't think the "how" is important here.

I need pointers on how to get out of the batching loop here. It should be epoch -> 10 batches -> next epoch, but it goes on indefinitely. There are a few issues, but the main one is that the output looks like this:

Epoch 1/90:
Batch 1/10
Loss mean:  tensor(nan, grad_fn=<MeanBackward0>)
Batch 2/10
...
Batch 291/10
Loss mean:  tensor(-4.44432, grad_fn=<MeanBackward0>)

I do realize the loss is shot, but I think I need to take care of this first.
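(As an aside on the NaN itself -- this isn't in my script, just a debugging option I'm aware of:)

import torch

# Anomaly detection makes backward() raise an error at the gradient op that
# first produces a NaN, with a traceback pointing at the forward operation
# responsible, instead of silently propagating NaN into the loss.
torch.autograd.set_detect_anomaly(True)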

Main train loop:

model.train()
for epoch in range(args.start_epoch, args.start_epoch + args.epochs):
    print("Epoch {}/{}:".format(epoch + 1, args.start_epoch + args.epochs))
    for i, (batch, speakers) in enumerate(train_loader):
        # Runs once per batch yielded by train_loader
        print("Batch {}/{}   ".format(i + 1, BATCHES_PER_EPOCH))
        optimizer.zero_grad()
        score_posp, score_negp, speakers_probs, speakers = model(
            torch.tensor(batch, device=device),
            torch.tensor(speakers, dtype=torch.long, device=device))
        loss = loss_fn(score_negp, score_posp, speakers, speakers_probs)
        print("Loss mean: ", loss.mean())
        loss.mean().backward()
        optimizer.step()
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'optimizer': optimizer.state_dict(),
    }, filename="./checkpoints/checkpoint_e{}.pth.tar".format(epoch))
# As indented here, the scheduler only steps once, after all epochs
scheduler.step()

Batching:

def make_batch(items):
    # Collate a list of (sample, speaker) items into two numpy arrays
    samples = [item[0] for item in items]
    speakers = [item[1] for item in items]
    return np.array(samples), np.array(speakers)


voices_loader = Loader(args.data)

# batch_size is the number of samples per batch, so the DataLoader yields
# roughly len(voices_loader) / BATCHES_PER_EPOCH batches per epoch
train_loader = DataLoader(dataset=voices_loader,
                          shuffle=True,
                          num_workers=2,
                          batch_size=BATCHES_PER_EPOCH,
                          collate_fn=make_batch)

VoxCelebLoader loads triples consisting of a pair of fragments from the same speaker and one fragment from a random, different speaker. Can someone help me see why it's infinite?
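For what it's worth, here is a minimal sketch of what such a triplet dataset can look like -- purely illustrative, with a hypothetical load_fragment helper and speaker_to_files index, not my actual Loader:

import random
from torch.utils.data import Dataset

class TripletVoices(Dataset):
    """Illustrative only: yields (anchor, positive, negative), speaker index."""
    def __init__(self, speaker_to_files):
        # dict: speaker id -> list of fragment paths (each speaker has >= 2)
        self.speaker_to_files = speaker_to_files
        self.speakers = list(speaker_to_files)

    def __len__(self):
        # This is what ultimately determines how many batches an epoch has
        return sum(len(v) for v in self.speaker_to_files.values())

    def __getitem__(self, idx):
        spk = random.choice(self.speakers)
        anchor, positive = random.sample(self.speaker_to_files[spk], 2)
        other = random.choice([s for s in self.speakers if s != spk])
        negative = random.choice(self.speaker_to_files[other])
        # load_fragment is a hypothetical helper that reads audio into an array
        return (load_fragment(anchor), load_fragment(positive),
                load_fragment(negative)), self.speakers.index(spk)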

1 Comment

揽月 2025-02-02 09:18:53


It's probably just a really long dataset; you can check it with:

print(len(voices_loader))

It can't physically be infinite unless you messed something up in that Loader of yours.
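The number of batches per epoch follows directly from the dataset length and batch_size. A quick sanity check, assuming the DataLoader from your question (default drop_last=False):

import math

n_items = len(voices_loader)                                 # items your Loader reports
expected_batches = math.ceil(n_items / BATCHES_PER_EPOCH)    # BATCHES_PER_EPOCH is your batch_size
print(n_items, expected_batches, len(train_loader))          # the last two should agree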
If you want to trim it, you can use:

for epoch in range(args.start_epoch, args.start_epoch + args.epochs):
    print("Epoch {}/{}:".format(epoch + 1, args.start_epoch + args.epochs))
    for i, (batch, speakers) in enumerate(train_loader):
        if i >= 1000:
            break
        ...  # your code
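Alternatively, if you want each epoch to really be a fixed number of batches, you can cap how many samples are drawn per epoch with a RandomSampler instead of breaking out of the loop. A sketch, assuming a hypothetical samples-per-batch value and your existing voices_loader / make_batch:

from torch.utils.data import DataLoader, RandomSampler

batch_size = 32  # hypothetical: whatever your real samples-per-batch is

# Draw exactly BATCHES_PER_EPOCH * batch_size samples each epoch (with
# replacement), so every epoch is BATCHES_PER_EPOCH batches long.
sampler = RandomSampler(voices_loader,
                        replacement=True,
                        num_samples=BATCHES_PER_EPOCH * batch_size)

train_loader = DataLoader(dataset=voices_loader,
                          sampler=sampler,   # do not pass shuffle=True together with a sampler
                          num_workers=2,
                          batch_size=batch_size,
                          collate_fn=make_batch)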