How do I understand the loss vs. learning rate (log scale) plot from learner.lr_plot in the ktrain package?
I am using the ktrain package to classify text. My experiment is shown below:
lr_find and lr_plot are functions in ktrain. They can be used to highlight the best learning rate, which is shown as the red dot in the plot (a minimal usage sketch follows the questions below).
I do not understand how to read this plot:
- How do I convert the log scale to a normal linear scale?
- Why is the best learning rate the red dot?
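For context, here is a minimal sketch of how these two calls are typically used. The objects `model`, `train_data`, and `val_data` are placeholders rather than anything from the original experiment, and the chosen learning rate is only illustrative:

```python
# Minimal sketch of the typical lr_find / lr_plot workflow in ktrain.
# `model`, `train_data`, and `val_data` stand in for objects built earlier
# in the experiment (not shown in the original post).
import ktrain

learner = ktrain.get_learner(model, train_data=train_data, val_data=val_data)

# Sweep learning rates from very small to very large, recording the loss.
learner.lr_find(show_plot=True)   # show_plot assumed supported; otherwise call learner.lr_plot()

# The x-axis of the plot is logarithmic: a tick at 1e-3 (10^-3) is simply the
# ordinary learning rate 0.001 on a linear scale, 1e-2 is 0.01, and so on.
chosen_lr = 1e-3                  # e.g. a value read off where the loss is still falling
learner.fit_onecycle(chosen_lr, 3)
```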
1 Answer
As the text from the lr_find method says, you can visually inspect the plot and choose a learning rate in the range where the loss is falling prior to divergence. A higher learning rate within this range will converge faster. This is an idea called the "LR range test" from Leslie Smith's paper, which became popular through the fastai library and was later adopted by other libraries such as ktrain and Amazon's Gluon library. The red dot in this plot is just a numerical approximation of where the loss is falling dramatically; it may be useful for automated scenarios, but it is not necessarily the best choice. In this plot, the red dot marks the steepest part of the curve, which is one strategy for automatically selecting a learning rate from the plot (without visual inspection). Other automated strategies include taking the learning rate associated with the minimum loss and dividing it by 10, and finding the learning rate associated with the longest valley.
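To make those strategies concrete, here is a rough sketch (not ktrain's actual implementation) of the "steepest slope" heuristic behind the red dot and the "minimum loss divided by 10" heuristic, assuming the learning rates and losses recorded during the range test are available as arrays:

```python
import numpy as np

def steepest_point(lrs, losses):
    """Learning rate where the loss is dropping fastest (the 'red dot' idea)."""
    lrs, losses = np.asarray(lrs), np.asarray(losses)
    # Differentiate with respect to log(lr), since the sweep is exponential in lr.
    slopes = np.gradient(losses, np.log(lrs))
    return lrs[np.argmin(slopes)]        # most negative slope = steepest descent

def min_loss_over_10(lrs, losses):
    """Learning rate at the minimum loss, divided by 10 (a common heuristic)."""
    lrs, losses = np.asarray(lrs), np.asarray(losses)
    return lrs[np.argmin(losses)] / 10.0
```

In practice the recorded losses are noisy, so smoothing them before taking the gradient gives a more stable estimate, which is one more reason visually inspecting the plot is still recommended over trusting any single automated pick.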