How to make the reported step a multiple of the logging frequency in PyTorch Lightning, rather than the logging frequency minus 1?
[Warning!! pedantry inside]
I'm using PyTorch Lightning to wrap my PyTorch model, but because I'm pedantic, I am finding the logger to be frustrating in the way it reports the steps at the frequency I've asked for, minus 1:
- When I set log_every_n_steps=100 in Trainer, my Tensorboard output shows my metrics at step 99, 199, 299, etc. Why not at 100, 200, 300?
- When I set check_val_every_n_epoch=30 in Trainer, my console output shows the progress bar goes up to epoch 29, then does a validate, leaving a trail of console outputs that report metrics after epochs 29, 59, 89, etc. Like this:
Epoch 29: 100%|█████████████████████████████| 449/449 [00:26<00:00, 17.01it/s, loss=0.642, v_num=logs]
[validation] {'roc_auc': 0.663, 'bacc': 0.662, 'f1': 0.568, 'loss': 0.633}
Epoch 59: 100%|█████████████████████████████| 449/449 [00:26<00:00, 16.94it/s, loss=0.626, v_num=logs]
[validation] {'roc_auc': 0.665, 'bacc': 0.652, 'f1': 0.548, 'loss': 0.630}
Epoch 89: 100%|█████████████████████████████| 449/449 [00:27<00:00, 16.29it/s, loss=0.624, v_num=logs]
[validation] {'roc_auc': 0.665, 'bacc': 0.652, 'f1': 0.548, 'loss': 0.627}
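The relevant part of my Trainer setup looks roughly like this (a sketch; model and the dataloaders stand in for my actual code):

from pytorch_lightning import Trainer

# Placeholder setup reproducing the two settings mentioned above;
# model, train_loader and val_loader are defined elsewhere.
trainer = Trainer(
    log_every_n_steps=100,
    check_val_every_n_epoch=30,
)
trainer.fit(model, train_loader, val_loader)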
Am I doing something wrong? Should I simply submit a PR to PL to fix this?
You are not doing anything wrong. Python uses zero-based indexing, so epoch counting starts at zero as well. If you want to change the behavior of what is being displayed, you will need to override the default TQDMProgressBar and modify on_train_epoch_start to display an offset value. You can achieve this along the following lines.
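A minimal sketch of such a subclass, assuming a recent Lightning release where the training bar is exposed as train_progress_bar (older releases name it main_progress_bar); the class name is just a placeholder:

from pytorch_lightning.callbacks import TQDMProgressBar

class OffsetEpochBar(TQDMProgressBar):
    """Progress bar that counts epochs from 1 instead of 0."""

    def on_train_epoch_start(self, trainer, pl_module):
        # Keep the default behaviour (bar reset, totals, etc.) ...
        super().on_train_epoch_start(trainer, pl_module)
        # ... then overwrite the description with a 1-based epoch number.
        # On older Lightning versions use self.main_progress_bar instead.
        self.train_progress_bar.set_description(f"Epoch {trainer.current_epoch + 1}")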
Notice the +1 in the last line of code; this offsets the epoch displayed in the progress bar. Then pass your custom bar to your trainer:
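For example (a sketch; any other Trainer arguments stay as in your existing setup):

from pytorch_lightning import Trainer

trainer = Trainer(
    log_every_n_steps=100,
    check_val_every_n_epoch=30,
    callbacks=[OffsetEpochBar()],  # the custom bar defined above
)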
Finally:
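Start the fit as you normally would; model here stands in for your own LightningModule:

trainer.fit(model)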
For the first epoch this will display something like:
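Epoch 1: 100%|█████████████████████████████| 449/449 [...]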
and not the default:
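Epoch 0: 100%|█████████████████████████████| 449/449 [...]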