Support Vector Machines - a simple explanation?
So, I'm trying to understand how the SVM algorithm works, but I just cannot figure out how you transform a dataset into points in an n-dimensional space that have a mathematical meaning, so that the points can be separated by a hyperplane and classified.
There's an example here: they are trying to classify pictures of tigers and elephants. They say "We digitize them into 100x100 pixel images, so we have x in an n-dimensional plane, where n=10,000". But my question is: how do they transform the matrices, which really just represent some color codes, into points that have a mathematical meaning so that they can be classified into two categories?
Perhaps someone can explain this to me with a 2D example, because every graphical representation I see is only 2D, not n-D.
1 Answer
The short answer is: they don't transform the matrices; instead, they treat each element in the matrix as a dimension (in machine learning, each element would be called a feature).
Thus, they need to classify elements with 100x100 = 10,000 features each. In the linear SVM case, they do so using a hyperplane, which divides the 10,000-dimensional space into two distinct regions.
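For concreteness, here is a minimal sketch of that idea in Python using scikit-learn's LinearSVC; the images and labels below are random placeholders standing in for digitized tiger/elephant photos:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Pretend we have 20 grayscale 100x100 images: 10 tigers, 10 elephants.
# (Random data here -- stand-ins for real digitized photos.)
rng = np.random.default_rng(0)
images = rng.random((20, 100, 100))      # shape: (n_samples, 100, 100)
labels = np.array([0] * 10 + [1] * 10)   # 0 = tiger, 1 = elephant

# "Treat each element as a dimension": flatten each 100x100 matrix
# into a single vector of 10,000 numbers. No other transformation.
X = images.reshape(20, -1)               # shape: (20, 10000)

clf = LinearSVC()                        # fits a separating hyperplane
clf.fit(X, labels)

# The learned hyperplane lives in 10,000-dimensional space:
print(clf.coef_.shape)                   # (1, 10000) -- one weight per pixel
```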
A longer answer would be:
Consider your 2D case. You want to separate a set of two-dimensional elements, which means that each element in your set can be described mathematically as a 2-tuple, namely e = (x1, x2). For example, in your figure, some of the full dots might be {(1,3), (2,4)}, and some of the hollow ones might be {(4,2), (5,1)}. Note that in order to classify them with a linear classifier, you need a 2-dimensional linear classifier, which yields a decision rule that might look like this: (w1 * x1 + w2 * x2) > C.
Note that the classifier is linear, as it is a linear combination of the elements of e. The w's are called weights, and C is the decision threshold. A linear function of 2 elements, as above, is simply a line; that's why in your figures the H's are lines.
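To make this concrete, here is a tiny Python sketch of that 2D decision rule applied to the example points above; the weights and threshold are hand-picked for illustration, not learned:

```python
# The 2D decision rule (w1*x1 + w2*x2) > C applied to the example points.
full_dots   = [(1, 3), (2, 4)]   # one class
hollow_dots = [(4, 2), (5, 1)]   # the other class

w1, w2, C = -1.0, 1.0, 0.0       # hand-picked: the line -x1 + x2 = 0

def classify(point):
    x1, x2 = point
    return "full" if (w1 * x1 + w2 * x2) > C else "hollow"

for p in full_dots + hollow_dots:
    print(p, "->", classify(p))
# (1, 3) -> full, (2, 4) -> full, (4, 2) -> hollow, (5, 1) -> hollow
```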
Now, back to our n-dimensional case: you can probably figure out that a line will not do the trick. In the 3D case, we would need a plane: (w1 * x1 + w2 * x2 + w3 * x3) > C, and in the n-dimensional case, we would need a hyperplane: (w1 * x1 + w2 * x2 + ... + wn * xn) > C, which is damn hard to imagine, let alone to draw :-).
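Hard to draw, but easy to compute: the n-dimensional rule is just a dot product. A small sketch, with random placeholder weights and input, and n = 10,000 to match the image example:

```python
import numpy as np

n = 10_000
rng = np.random.default_rng(1)
w = rng.standard_normal(n)    # one weight per pixel/feature
x = rng.standard_normal(n)    # a flattened 100x100 image
C = 0.0

# Which side of the hyperplane does x fall on?
side = np.dot(w, x) > C
print("class A" if side else "class B")
```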