Faster R-CNN training loss becomes either NaN or infinity
I want to train the PyTorch Faster R-CNN module on a custom dataset that I curated and labelled. The implementation looks straightforward; there is a demo that shows training and inference on a custom dataset (a person detection problem). Just like the person detection dataset, where there is only one class (person) along with the background class, my own dataset also has only one class, so I saw no need to change any of the hyper-parameters. It is unclear what is causing the training loss to become NaN or infinity. This happens in the first epoch. Grateful for any suggestions on this issue.
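For context, this is a minimal sketch of how I set the model up, following the torchvision detection finetuning tutorial; the helper name and `num_classes=2` (one foreground class plus background) are just how I adapted it, not part of the library:

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_single_class_model(num_classes=2):
    """One foreground class plus background, as in the person-detection demo."""
    # weights="DEFAULT" on torchvision >= 0.13; older releases use pretrained=True
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # Replace the COCO box predictor with one sized for our number of classes
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```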
Edit:
I found that my masks were sometimes out of bounds, which was blowing up the loss, so the cumulative loss was going to either positive or negative infinity. It might not have been obvious, except that the train_one_epoch function provided by the PyTorch Detection Colab Notebook example on the PennFudan dataset includes a check that sys.exit()s when the loss is no longer math.isfinite().
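For anyone hitting the same thing, here is a rough sketch of the sanity check that would have caught this earlier: clamp the boxes to the image and drop degenerate ones before they reach the model. The helper name and tensor layout are assumptions for illustration, not from my dataset code:

```python
import torch

def sanitize_boxes(boxes, image_size):
    """Clip [x1, y1, x2, y2] boxes to the image and drop degenerate ones.

    boxes: (N, 4) float tensor; image_size: (height, width).
    Hypothetical helper -- call it in the dataset's __getitem__ before
    building the target dict.
    """
    h, w = image_size
    boxes = boxes.clone()
    boxes[:, 0::2] = boxes[:, 0::2].clamp(min=0, max=w)  # x coordinates
    boxes[:, 1::2] = boxes[:, 1::2].clamp(min=0, max=h)  # y coordinates
    # keep only boxes with positive width and height
    keep = (boxes[:, 2] > boxes[:, 0]) & (boxes[:, 3] > boxes[:, 1])
    return boxes[keep]
```

With the targets kept inside the image bounds, the loss stays finite and the math.isfinite() guard in train_one_epoch no longer fires.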