Example of using the Python bindings for the SVM library LIBSVM

Posted on 2024-10-03 03:21:27

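I am in dire need of a classification task example using LibSVM in Python. I don't know what the input should look like, or which function is responsible for training and which for testing. Thanks.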

只有一腔孤勇 2024-10-10 03:21:27

The code examples listed here don't work with LibSVM 3.1, so I've more or less ported the example by mossplix:

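from svmutil import *

# Add a convenience method to svm_model: predict the label of one sample x.
# svm_predict returns (labels, accuracy, values); the dummy true label [0]
# only affects the reported accuracy.
svm_model.predict = lambda self, x: svm_predict([0], [x], self)[0][0]

# Two training samples: class labels first, then the feature vectors.
prob = svm_problem([1, -1], [[1, 0, 1], [-1, 0, -1]])

param = svm_parameter()
param.kernel_type = LINEAR
param.C = 10

m = svm_train(prob, param)  # training

m.predict([1, 1, 1])  # testing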

莫相离 2024-10-10 03:21:27

This example demonstrates a simple two-class SVM classifier; it's about as simple as possible while still showing the complete LIBSVM workflow.

Step 1: Import NumPy & LIBSVM

import numpy as NP
from svm import *

Step 2: Generate synthetic data: for this example, 500 points within a given boundary (note: quite a few real data sets are provided on the LIBSVM website)

Data = NP.random.randint(-5, 5, 1000).reshape(500, 2)

Step 3: Now, choose some non-linear decision boundary for the classifier (here, a circle of radius 3 centred on the origin):

rx = [ 1 if (x**2 + y**2) < 9 else 0 for (x, y) in Data ]

Step 4: Next, partition the data with respect to this decision boundary:

Class I (label 1): points that lie strictly inside the circle
Class II (label 0): all other points, on or outside the decision boundary (circle)

The SVM Model Building begins here; all steps before this one were just to prepare some synthetic data.
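
Steps 2 to 4 can also be sketched with the standard library alone, which yields plain Python lists rather than a NumPy array. This is only a minimal sketch, not the answer's own code: `make_circle_data` and its parameters are hypothetical names, and the radius-3 circle mirrors the `< 9` test used above.

```python
import random

def make_circle_data(n_points=500, low=-5, high=4, radius_sq=9, seed=0):
    """Return (labels, points): label 1 inside the circle, 0 otherwise.

    Hypothetical helper for illustration. Integer coordinates in [low, high]
    match NP.random.randint(-5, 5), whose upper bound is exclusive.
    """
    rng = random.Random(seed)
    points = [[rng.randint(low, high), rng.randint(low, high)]
              for _ in range(n_points)]
    labels = [1 if x**2 + y**2 < radius_sq else 0 for x, y in points]
    return labels, points

labels, points = make_circle_data()
print(len(points), sum(labels))  # number of points, number labelled 1
```

The two returned lists are in the same order as the arguments that svm_problem takes in Step 5 (labels first, then data).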

Step 5: Construct the problem description by calling svm_problem, passing in the class labels (rx) and the data, then bind this result to a variable.

px = svm_problem(rx, Data)

Step 6: Select a kernel function for the non-linear mapping

For this example, I chose RBF (radial basis function) as my kernel function

pm = svm_parameter(kernel_type=RBF)

Step 7: Train the classifier by calling svm_model, passing in the problem description (px) and the parameters, including the kernel (pm)

v = svm_model(px, pm)

Step 8: Finally, test the trained classifier by calling predict on the trained model object ('v')

v.predict([3, 1])
# returns the class label (either '1' or '0')

For the example above, I used version 3.0 of LIBSVM (the current stable release at the time this answer was posted).

With respect to the part of your question regarding the choice of kernel function, Support Vector Machines are not specific to a particular kernel function; e.g., I could have chosen a different kernel (linear, polynomial, sigmoid, etc.).

LIBSVM includes all of the most commonly used kernel functions, which is a big help because you can see the plausible alternatives; selecting one for your model is just a matter of calling svm_parameter and passing in a value for kernel_type (a constant such as LINEAR, POLY, RBF, or SIGMOID).

Finally, the kernel function you choose for training must match the kernel function used against the testing data.

并安 2024-10-10 03:21:27

LIBSVM reads the data from two lists: the first list contains the classes and the second list contains the input data. The example below creates a simple dataset with two possible classes. You also need to specify which kernel you want to use by creating an svm_parameter.

>>> from libsvm import *
>>> prob = svm_problem([1, -1], [[1, 0, 1], [-1, 0, -1]])
>>> param = svm_parameter(kernel_type=LINEAR, C=10)
>>> # training the model
>>> m = svm_model(prob, param)
>>> # testing the model
>>> m.predict([1, 1, 1])

等待圉鍢 2024-10-10 03:21:27

You might consider using

http://scikit-learn.sourceforge.net/

It has great Python bindings for libsvm and should be easy to install.

沫离伤花 2024-10-10 03:21:27

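Adding to @shinNoNoir:

param.kernel_type is the type of kernel function you want to use:
0: Linear
1: Polynomial
2: RBF (radial basis function)
3: Sigmoid

Also keep in mind that in svm_problem(y, x), y is the list of class labels and x is the list of class instances, and that x and y can only be lists, tuples, or dictionaries (no NumPy arrays).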
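
If the data starts out in some other container, it can be converted before calling svm_problem. The following is a minimal sketch that assumes plain nested sequences as input; `to_dense_lists` and `to_sparse_dicts` are hypothetical helper names, and the dictionary form uses 1-based feature indices, as in LIBSVM's sparse data format.

```python
def to_dense_lists(rows):
    """Convert any iterable of numeric rows into a list of plain float lists."""
    return [[float(v) for v in row] for row in rows]

def to_sparse_dicts(rows):
    """Convert rows into {index: value} dicts, dropping zero entries.

    Indices start at 1, following LIBSVM's sparse format.
    """
    return [{i: float(v) for i, v in enumerate(row, start=1) if v != 0}
            for row in rows]

y = [1, -1]
x = ((1, 0, 1), (-1, 0, -1))
print(to_dense_lists(x))   # [[1.0, 0.0, 1.0], [-1.0, 0.0, -1.0]]
print(to_sparse_dicts(x))  # [{1: 1.0, 3: 1.0}, {1: -1.0, 3: -1.0}]
```

For a NumPy array, calling its `tolist()` method gives the same dense form.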

执着的年纪 2024-10-10 03:21:27

SVM via SciKit-learn:

from sklearn.svm import SVC
X = [[0, 0], [1, 1]]
y = [0, 1]
model = SVC().fit(X, y)

tests = [[0.,0.], [0.49,0.49], [0.5,0.5], [2., 2.]]
print(model.predict(tests))
# prints [0 0 1 1]

For more details, see: http://scikit-learn.org/stable/modules/svm.html#svm

公布 2024-10-10 03:21:27

Here is a dummy example I mashed up:

import numpy
import matplotlib.pyplot as plt
from random import seed
from random import randrange

import svmutil as svm

seed(1)

# Creating Data (Dense)
train = list([randrange(-10, 11), randrange(-10, 11)] for i in range(10))
labels = [-1, -1, -1, 1, 1, -1, 1, 1, 1, 1]
options = '-t 0'  # linear model
# Training Model
model = svm.svm_train(labels, train, options)


# Line Parameters
w = numpy.matmul(numpy.array(train)[numpy.array(model.get_sv_indices()) - 1].T, model.get_sv_coef())
b = -model.rho.contents.value
if model.get_labels()[0] == -1:  # LIBSVM treats the first label as the positive side; flip so that w.x + b > 0 means +1
    w = -w
    b = -b

print(w)
print(b)

# Plotting
plt.figure(figsize=(6, 6))
for i in model.get_sv_indices():
    plt.scatter(train[i - 1][0], train[i - 1][1], color='red', s=80)
train = numpy.array(train).T
plt.scatter(train[0], train[1], c=labels)
plt.plot([-5, 5], [-(-5 * w[0] + b) / w[1], -(5 * w[0] + b) / w[1]])
plt.xlim([-13, 13])
plt.ylim([-13, 13])
plt.show()
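
For a linear kernel the decision function is f(x) = sum_i coef_i * <sv_i, x> - rho, so the "Line Parameters" step above amounts to w = sum_i coef_i * sv_i and b = -rho. A minimal pure-Python sketch of that arithmetic follows; the support vectors, coefficients, and rho are made-up values for illustration (the hard-margin solution for the two points (1, 1) and (-1, -1)), not output from LIBSVM, and the function names are hypothetical.

```python
def linear_hyperplane(support_vectors, sv_coefs, rho):
    """Return (w, b) with w = sum_i coef_i * sv_i and b = -rho."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for sv, coef in zip(support_vectors, sv_coefs):
        for j in range(dim):
            w[j] += coef * sv[j]
    return w, -rho

def decision_value(w, b, x):
    """Evaluate w . x + b; its sign selects the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Made-up values: one support vector per class, separated by the line x0 + x1 = 0.
svs = [[1.0, 1.0], [-1.0, -1.0]]
coefs = [0.25, -0.25]
w, b = linear_hyperplane(svs, coefs, rho=0.0)
print(w)                                  # [0.5, 0.5]
print(decision_value(w, b, [1.0, 1.0]))   # 1.0
```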

灵芸 2024-10-10 03:21:27

param = svm_parameter('-s 0 -t 2 -d 3 -c '+str(C)+' -g '+str(G)+' -p '+str(self.epsilon)+' -n '+str(self.nu))

I don't know about the earlier versions, but in LibSVM 3.xx the method svm_parameter('options') takes just one argument.

In my case C, G, p and nu are dynamic values. Adjust them to suit your own code.
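
Building the option string by hand with repeated str() calls is easy to get wrong, so one possible alternative is a small helper. This is only a sketch: `build_options` is a hypothetical name, not part of LIBSVM, and it assumes each keyword is the flag letter without its dash.

```python
def build_options(**opts):
    """Join keyword arguments into a LIBSVM option string, e.g. '-s 0 -t 2'.

    Hypothetical helper: each key is the flag letter without the dash.
    """
    return ' '.join('-{} {}'.format(flag, value) for flag, value in opts.items())

C, G = 10, 0.5  # example values; in practice these come from your own code
options = build_options(s=0, t=2, c=C, g=G)
print(options)  # -s 0 -t 2 -c 10 -g 0.5
```

The resulting string is what gets passed as the single argument to svm_parameter.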

options:

    -s svm_type : set type of SVM (default 0)
        0 -- C-SVC      (multi-class classification)
        1 -- nu-SVC     (multi-class classification)
        2 -- one-class SVM
        3 -- epsilon-SVR    (regression)
        4 -- nu-SVR     (regression)
    -t kernel_type : set type of kernel function (default 2)
        0 -- linear: u'*v
        1 -- polynomial: (gamma*u'*v + coef0)^degree
        2 -- radial basis function: exp(-gamma*|u-v|^2)
        3 -- sigmoid: tanh(gamma*u'*v + coef0)
        4 -- precomputed kernel (kernel values in training_set_file)
    -d degree : set degree in kernel function (default 3)
    -g gamma : set gamma in kernel function (default 1/num_features)
    -r coef0 : set coef0 in kernel function (default 0)
    -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
    -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
    -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
    -m cachesize : set cache memory size in MB (default 100)
    -e epsilon : set tolerance of termination criterion (default 0.001)
    -h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
    -b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
    -wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
    -v n: n-fold cross validation mode
    -q : quiet mode (no outputs)

Source of documentation: https://www.csie.ntu.edu.tw/~cjlin/libsvm/
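
The four kernel formulas listed under -t can be written out directly, which may help when deciding which parameters (gamma, coef0, degree) matter for a given kernel. This is a minimal pure-Python sketch of the formulas as given in the option list, not LIBSVM's implementation; the function names are chosen here for illustration.

```python
import math

def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear(u, v):
    return dot(u, v)

def polynomial(u, v, gamma, coef0=0.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree

def rbf(u, v, gamma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def sigmoid(u, v, gamma, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)

u, v = [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]
gamma = 1.0 / len(u)  # the documented default: 1/num_features
print(linear(u, v))      # 2.0
print(rbf(u, u, gamma))  # 1.0
```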
