SVM顺序最小优化的收敛问题

Posted 2024-08-27 13:20:51


I have been working on Support Vector Machines for about 2 months now. I have coded the SVM myself, and for the optimization problem of the SVM I have used Sequential Minimal Optimization (SMO) by Dr. John Platt.

Right now I am in the phase where I am doing a grid search to find the optimal C value for my dataset. (Please find details of my project application and dataset here: "SVM Classification - minimum number of input sets for each class".)
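For reference, a grid search over an exponential C grid like the one described can be sketched as below. The `train_and_score` callable is a hypothetical placeholder for the custom SVM (e.g. train on a fold, return cross-validated accuracy); it is not part of the original post.

```python
def grid_search_C(X, y, train_and_score, exponents=range(0, 7)):
    """Try C = 2^0 .. 2^6 and return the best (C, accuracy) pair.

    `train_and_score(X, y, C)` is a placeholder for training the
    custom SVM with parameter C and returning its accuracy.
    """
    best_C, best_acc = None, -1.0
    for k in exponents:
        C = 2.0 ** k
        acc = train_and_score(X, y, C)
        if acc > best_acc:
            best_C, best_acc = C, acc
    return best_C, best_acc
```

In practice the scoring step would use cross-validation rather than training accuracy, so the chosen C does not overfit the training set.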

I have successfully checked my custom-implemented SVM's accuracy for C values ranging from 2^0 to 2^6. But now I am having some issues with the convergence of SMO for C >= 128.
For example, when I try to find the alpha values for C = 128, it takes a very long time before it actually converges and successfully gives the alpha values.

The time taken for SMO to converge is about 5 hours for C = 100. I think this is huge (because SMO is supposed to be fast), even though I am getting good accuracy.
I am stuck right now because I cannot test the accuracy for higher values of C.

I am actually printing the number of alphas changed in every pass of SMO, and I keep getting 10, 13, 8, ... alphas changing continuously. The KKT conditions assure convergence, so what is going on here?
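The counting described above matches the outer loop of the simplified SMO pseudocode, where a pass counts as "quiet" only when no pair was updated. A minimal sketch, assuming a hypothetical `try_optimize_pair(i)` that performs Platt's pairwise update and returns 1 if alpha_i (and its partner) changed:

```python
def smo_outer_loop(n_samples, try_optimize_pair, max_passes=3):
    """Sketch of the simplified-SMO outer loop.

    Records how many alphas change per pass; convergence is declared
    only after `max_passes` consecutive passes with zero changes.
    """
    history = []  # number of alphas changed in each pass
    passes = 0
    while passes < max_passes:
        num_changed = sum(try_optimize_pair(i) for i in range(n_samples))
        history.append(num_changed)
        passes = passes + 1 if num_changed == 0 else 0
    return history
```

Seeing 10, 13, 8, ... changed alphas means the loop keeps resetting the quiet-pass counter, which is exactly the behavior that drags out training for large C.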

Please note that my implementation works fine for C <= 100 with good accuracy, though the execution time is long.

Please give me your input on this issue.

Thank You and Cheers.


2 Answers

带刺的爱情 2024-09-03 13:20:51


For most SVM implementations, training time can increase dramatically with larger values of C. To get a sense of how training time in a reasonably good implementation of SMO scales with C, take a look at the log-scale line for libSVM in the graph below.

[Figure: SVM training time vs. C, from Sentelle et al.'s "A Fast Revised Simplex Method for SVM Training" (http://dmcer.net/StackOverflowImages/svm_scaling.png)]


You probably have two easy ways and one not so easy way to make things faster.

Let's start with the easy stuff. First, you could try loosening your convergence criterion. A strict criterion like epsilon = 0.001 will take much longer to train, while typically resulting in a model that is no better than one trained with a looser criterion like epsilon = 0.01. Second, you should try profiling your code to see if there are any obvious bottlenecks.
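Concretely, the epsilon in question is the tolerance in the KKT-violation test that decides whether an alpha is still worth optimizing. A sketch of the standard check (variable names here are illustrative, not from the poster's code):

```python
def violates_kkt(alpha_i, y_i, error_i, C, tol=0.01):
    """Return True if example i violates the KKT conditions by more
    than `tol` and its alpha can still move.

    error_i = f(x_i) - y_i, the prediction error for example i.
    Loosening tol (e.g. 0.001 -> 0.01) is what "relaxing the
    convergence criterion" means: fewer examples count as violators,
    so SMO stops sooner.
    """
    r = y_i * error_i
    return (r < -tol and alpha_i < C) or (r > tol and alpha_i > 0)
```

With a larger tol, examples whose alphas would only be nudged by a tiny amount are skipped, which is usually harmless for accuracy but cuts many passes from the outer loop.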

The not-so-easy fix would be to switch to a different optimization algorithm (e.g., SVM-RSQP from Sentelle et al.'s paper above). But if you have a working implementation of SMO, you should probably only do that as a last resort.

桃酥萝莉 2024-09-03 13:20:51


If you want complete convergence, especially when C is large, it takes a very long time. You can consider defining a looser stopping criterion and setting a maximum number of iterations (the default in libSVM is 1000000). If the iteration count is large, the time multiplies and the gain is not worth the cost. The result may not fully satisfy the KKT conditions (some support vectors end up inside the margin band, and some non-support vectors outside it), but the error is small and acceptable. In my opinion, if higher accuracy is required, it is better to use another quadratic-programming algorithm instead of SMO.
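The iteration cap described above can be sketched as a simple budget around the SMO update step. The `step` callable is a hypothetical stand-in for one pairwise SMO update attempt that returns whether any alpha changed:

```python
def smo_with_budget(step, max_iter=1_000_000, tol_passes=10):
    """Run SMO update attempts until either `tol_passes` consecutive
    quiet iterations (approximate convergence) or the iteration
    budget is exhausted, libSVM-style.

    Returns (iterations_used, converged). Even when the budget runs
    out, the alphas found so far are usually usable, as the answer
    above notes.
    """
    quiet = 0  # consecutive iterations with no alpha update
    for it in range(max_iter):
        changed = step()
        quiet = 0 if changed else quiet + 1
        if quiet >= tol_passes:
            return it + 1, True
    return max_iter, False
```

The key design point is that hitting `max_iter` is not treated as an error: the model is returned anyway, trading exact KKT satisfaction for bounded training time.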
