predict.svm does not predict new data
Unfortunately, I have problems using predict() in the following simple example:
library(e1071)
x <- c(1:10)
y <- c(0,0,0,0,1,0,1,1,1,1)
test <- c(11:15)
mod <- svm(y ~ x, kernel = "linear", gamma = 1, cost = 2, type="C-classification")
predict(mod, newdata = test)
The result is as follows:
> predict(mod, newdata = test)
1 2 3 4 <NA> <NA> <NA> <NA> <NA> <NA>
0 0 0 0 0 1 1 1 1 1
Can anybody explain why predict() only gives the fitted values of the training sample (x,y) and does not care about the test-data?
Thank you very much for your help!
Richard
Comments (2)
It looks like this is because you misuse the formula interface to svm(). Normally, one supplies a data frame or similar object within which the variables in the formula are searched for. It usually doesn't matter if you don't do this, even if it is not best practice, but when you want to predict, not putting variables in a data frame gets you in a right mess. The reason it returns the training data is that you don't provide, as newdata, an object with a component named x in it. Hence it can't find the new data x, so it returns the fitted values. This is common for most R predict methods I know.
The solution then is to i) put your training data in a data frame and pass this to svm as the data argument, and ii) supply a new data frame containing x (from test) to predict(). E.g.:
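A minimal sketch of those two steps, assuming the data-frame names train and test_df (hypothetical, chosen here for illustration) and storing y as a factor so the classification intent is explicit:

```r
library(e1071)

# Step i: keep the training variables together in one data frame.
# `train` is an illustrative name, not taken from the original post.
train <- data.frame(
  x = 1:10,
  y = factor(c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1))
)

# Fit through the formula interface and tell svm() where to find x and y.
mod <- svm(y ~ x, data = train, kernel = "linear", gamma = 1, cost = 2,
           type = "C-classification")

# Step ii: the new data is also a data frame, with a column named `x`
# so that predict() can match it to the formula.
test_df <- data.frame(x = 11:15)

pred <- predict(mod, newdata = test_df)
pred          # one prediction per row of test_df
```

With this setup predict() should return five values, one per row of test_df, rather than the ten fitted values for the training sample.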
You need newdata to be of the same form, i.e. using a data.frame helps:
By the way, this is also shown in the help page for svm():
So in sum, use the formula interface and supply a data.frame; that is how essentially all modeling functions in R work.
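One way to guarantee that newdata has the same form as the training data is to take both from a single data frame. The sketch below is an assumption-laden illustration, not the help-page example: it uses R's built-in iris data set, an arbitrary 100/50 row split, and the illustrative names idx, train_rows and test_rows.

```r
library(e1071)

set.seed(1)                            # make the random split reproducible
idx <- sample(nrow(iris), 100)         # row indices used for training

train_rows <- iris[idx, ]              # training subset
test_rows  <- iris[-idx, ]             # held-out rows with identical columns

# Formula interface plus a data frame, as recommended above.
model <- svm(Species ~ ., data = train_rows)

# newdata is a data frame with the same column names as the training data.
pred <- predict(model, newdata = test_rows)

# Cross-tabulate predictions against the known labels of the held-out rows.
table(predicted = pred, actual = test_rows$Species)
```

Because both subsets come from the same data frame, the column names used in the formula are always present in newdata.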