What is a learning curve in machine learning?
I want to know what a learning curve in machine learning is. What is the standard way of plotting it? That is, what should the x and y axes of my plot be?
It usually refers to a plot of prediction accuracy or error against training set size (i.e., how much better the model gets at predicting the target as you increase the number of instances used to train it).
Usually both the training and test/validation performance are plotted together so that we can diagnose the bias-variance tradeoff (i.e., determine whether we would benefit from adding more training data, and assess model complexity by controlling regularization or the number of features).
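As a rough illustration of the plot described above, here is a minimal sketch that computes the points of a learning curve: training and validation error for increasing training set sizes. Everything in it is an assumption made for the example and not part of the answer: the synthetic data (a noisy line), the closed-form least-squares fit, and the helper names `fit_line`, `mse`, `make_data` and `learning_curve_points`.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, mean_y - a * mean_x

def mse(model, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, rng):
    """Synthetic data (an assumption): y = 3x + 2 plus Gaussian noise, std 2."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0.0, 2.0) for x in xs]
    return xs, ys

def learning_curve_points(train_sizes, n_val=500, seed=0):
    """Return (n, training MSE, validation MSE) for each training set size."""
    rng = random.Random(seed)
    x_train, y_train = make_data(max(train_sizes), rng)
    x_val, y_val = make_data(n_val, rng)
    points = []
    for n in train_sizes:
        # Train on the first n examples only, then score on both sets.
        model = fit_line(x_train[:n], y_train[:n])
        train_err = mse(model, x_train[:n], y_train[:n])
        val_err = mse(model, x_val, y_val)
        points.append((n, train_err, val_err))
    return points

points = learning_curve_points([2, 5, 10, 50, 200, 1000])
for n, train_err, val_err in points:
    print(f"n={n:5d}  train MSE={train_err:8.3f}  validation MSE={val_err:8.3f}")
```

To draw the curve, put the training set size `n` on the x axis and the two error columns on the y axis, one line each. With two training points the line fits them exactly, so the training error starts near zero, and both errors should settle near the noise variance as `n` grows.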
Note that a learning curve and an ROC curve are not the same thing.
As indicated in the other answers to this question, a learning curve conventionally shows performance on the vertical axis against some other parameter on the horizontal axis, such as training set size (in machine learning) or iteration/time (in both machine and biological learning). One salient point is that many parameters of the model change from one point on the plot to the next. Other answers here have done a great job of illustrating learning curves.
(There is also another meaning of learning curve in industrial manufacturing, originating in an observation in the 1930s that the number of labor hours needed to produce an individual unit decreases at a uniform rate as the quantity of units manufactured doubles. It isn't really relevant here, but it is worth noting for completeness and to avoid confusion in web searches.)
In contrast, a Receiver Operating Characteristic curve, or ROC curve, does not show learning; it shows performance. An ROC curve is a graphical depiction of classifier performance that shows the trade-off between increasing true positive rates (on the vertical axis) and increasing false positive rates (on the horizontal axis) as the discrimination threshold of the classifier is varied. Thus, only a single parameter associated with the model (the decision/discrimination threshold) changes from one point on the plot to the next. This ROC curve (from Wikipedia) shows the performance of three different classifiers.
There is no learning being depicted here, but rather performance with respect to two different classes of success/error as the classifier's decision threshold is made more lenient or strict. By looking at the area under the curve, we can see an overall indication of the classifier's ability to distinguish the classes. This area-under-the-curve metric is insensitive to the number of members in the two classes, so it may not reflect actual performance if class membership is unbalanced. The ROC curve has many subtleties, and interested readers might check out:
Fawcett, Tom. "ROC graphs: Notes and practical considerations for researchers." Machine Learning 31 (2004): 1-38.
Swets, John A., Robyn M. Dawes, and John Monahan. "Better decisions through Science." Scientific American (2000): 83.
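To make the ROC construction described above concrete, here is a small sketch that sweeps the decision threshold from strict to lenient and records the false positive rate and true positive rate at each step, then takes the area under the curve with the trapezoidal rule. The function names `roc_points` and `auc`, and the six example labels and scores, are made up for illustration.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    The threshold is swept from the highest score downward, so each step
    makes the classifier more lenient. Labels are 1 (positive) or 0.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]  # strictest threshold: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up example: three positives and three negatives with classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print("AUC =", auc(curve))  # 8/9, since 8 of the 9 positive/negative pairs are ranked correctly
```

Only the threshold changes between points here; the classifier and its scores stay fixed, which is the distinction from a learning curve drawn above.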
Some people use "learning curve" to refer to the error of an iterative procedure as a function of the iteration number, i.e., it illustrates the convergence of some utility function. In the example below, I plot the mean-square error (MSE) of the least-mean-square (LMS) algorithm as a function of the iteration number. That illustrates how quickly LMS "learns", in this case, the channel impulse response.
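The original plot is not reproduced here, but the idea can be sketched as follows. This is not the answerer's code: the three-tap channel, step size `mu`, noise level, and the number of runs averaged to estimate the MSE are all assumptions chosen for illustration.

```python
import random

def lms_learning_curve(channel, n_iter=500, n_runs=100, mu=0.05,
                       noise_std=0.05, seed=0):
    """Estimate the MSE of LMS at each iteration, averaged over n_runs runs.

    LMS tries to identify `channel` (the impulse response) from a white
    Gaussian input and the noisy channel output.
    """
    rng = random.Random(seed)
    taps = len(channel)
    mse = [0.0] * n_iter
    for _ in range(n_runs):
        w = [0.0] * taps       # adaptive filter weights, start at zero
        x_buf = [0.0] * taps   # most recent input samples, newest first
        for k in range(n_iter):
            x_buf = [rng.gauss(0.0, 1.0)] + x_buf[:-1]
            # Desired signal: channel output plus measurement noise.
            d = sum(h * x for h, x in zip(channel, x_buf)) + rng.gauss(0.0, noise_std)
            # A-priori error with the current weights.
            e = d - sum(wi * x for wi, x in zip(w, x_buf))
            # LMS update: w <- w + mu * e * x
            w = [wi + mu * e * x for wi, x in zip(w, x_buf)]
            mse[k] += e * e / n_runs
    return mse

curve = lms_learning_curve([0.8, -0.4, 0.2])
for k in (0, 10, 50, 100, 499):
    print(f"iteration {k:3d}: MSE ~ {curve[k]:.5f}")
```

Plotting `curve` against the iteration index (often on a logarithmic y axis) gives this kind of learning curve: the error starts high and decays toward a floor set by the noise and the step size.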
Basically, a learning curve lets you find the point from which the algorithm starts to learn. If you take the tangent to the curve, the point where its slope (the derivative) starts to become constant is where the algorithm starts to build its learning ability.
Depending on how your x and y axes are mapped, the values on one axis will start to approach a constant while the values on the other keep increasing. This is when you start seeing some learning. The whole curve lets you measure the rate at which your algorithm is able to learn. The maximum point is usually where the slope starts to recede. You can take a number of derivative measures at the maximum/minimum point.
So from the above examples you can see that the curve gradually tends towards a constant value. The algorithm initially starts to harness its learning through the training examples, and the slope levels off at the maximum/minimum point, where the curve comes closer and closer to the constant state. At this point it is able to pick up new examples from the test data and find new and unique results from the data.
You would use these x/y axes to plot epochs against error.
In Andrew's machine learning class, a learning curve is the plot of the training/cross-validation error versus the sample size. The learning curve can be used to detect whether the model has high bias or high variance. If the model suffers from a high bias problem, then as the sample size increases, the training error will increase and the cross-validation error will decrease, and in the end they will be very close to each other but still at a high error rate for both training and cross-validation. Increasing the sample size will not help much with a high bias problem.
If the model suffers from high variance, then as you keep increasing the sample size, the training error will keep increasing and the cross-validation error will keep decreasing, and they will end up at a low error rate for both training and cross-validation. So more samples will help to improve the model's prediction performance if the model suffers from high variance.
How can you determine, for a given model, whether more training points will be helpful? A useful diagnostic for this is the learning curve.
• A plot of the prediction accuracy/error vs. the training set size (i.e., how much better the model gets at predicting the target as you increase the number of instances used to train it)
• A learning curve conventionally shows performance on the vertical axis against another parameter on the horizontal axis, such as training set size (in machine learning) or iteration/time
• A learning curve is often useful to plot for algorithmic sanity checking or for improving performance
• Plotting a learning curve can help diagnose the problems your algorithm is suffering from
Personally, the two links below helped me understand this concept better:
Learning Curve
Sklearn Learning Curve
Use this code to plot:
Note that history = model.fit(...).
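The code from this answer did not survive on this page, so the following is only a guess at the kind of plot it meant, not the answerer's code. It assumes a Keras-style `history` object, whose `history.history` attribute is a dict of per-epoch metric lists such as `loss` and `val_loss`. The helper names `curves_from_history` and `plot_curves` and the example numbers are hypothetical.

```python
def curves_from_history(history_dict, metric="loss"):
    """Return (epochs, training values, validation values) for one metric.

    `history_dict` is assumed to look like Keras's `history.history`.
    The validation list is None when no "val_<metric>" entry is present.
    """
    train = list(history_dict[metric])
    val = history_dict.get("val_" + metric)
    epochs = list(range(1, len(train) + 1))
    return epochs, train, (list(val) if val is not None else None)

def plot_curves(epochs, train, val, metric="loss"):
    """Plot the training (and, if available, validation) curve per epoch."""
    # Imported lazily so curves_from_history works without matplotlib installed.
    import matplotlib.pyplot as plt
    plt.plot(epochs, train, label="training " + metric)
    if val is not None:
        plt.plot(epochs, val, label="validation " + metric)
    plt.xlabel("epoch")
    plt.ylabel(metric)
    plt.legend()
    plt.show()

# Stand-in for `history.history`; the numbers are made up for illustration.
example_history = {"loss": [0.9, 0.6, 0.45, 0.4],
                   "val_loss": [1.0, 0.7, 0.6, 0.62]}
epochs, train, val = curves_from_history(example_history)
print(epochs, train, val)
```

With a real model, something like `plot_curves(*curves_from_history(history.history))` after `history = model.fit(...)` would draw the epoch on the x axis and the loss on the y axis.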
It is a graph that compares the performance of a model on training and testing data over a varying number of training instances. Learning curves are widely used as a diagnostic tool in machine learning for algorithms that learn from a training dataset incrementally. A learning curve lets us verify when a model has learned as much as it can about the data.
There are three kinds of expectations for how learning curves absorb information
In simple terms, a learning curve is a plot of a metric such as loss or accuracy against the number of instances. The plot shows the journey of learning as experience is gained, hence the name learning curve.
Learning curves are widely used in machine learning for algorithms that learn (optimize their internal parameters) incrementally over time, such as deep learning neural networks.
Example
X = level
Y = salary
X Y
0 2000
2 4000
4 6000
6 8000
Linear regression gives 75% accuracy because it fits a straight line.
Polynomial regression gives 85% accuracy because it fits a curve.