Sequential feature selection in Matlab
Can somebody explain how to use the sequentialfs function in Matlab?
It looks straightforward, but I do not know how to design a function handle for it.
Any clue?
Comments (1)
Here's a simpler example than the one in the documentation.
First let's create a very simple dataset. We have some class labels y: 500 are from class 0 and 500 are from class 1, and they are randomly ordered.
We also have 100 variables x that we want to use to predict y. 99 of them are just random noise, but one of them is highly correlated with the class label.
Now let's say we want to classify the points using linear discriminant analysis. If we were to do this directly without applying any feature selection, we would first split the data up into a training set and a test set:
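A minimal sketch of the dataset and the split described above. The fixed seed, the noise scale of 0.1, and the 700/300 split are assumptions chosen for illustration; only the class sizes, the 100 columns, and the informative column 100 come from the text.

```matlab
rng(0);                                % fixed seed so the sketch is repeatable (assumption)

% 500 labels from class 0 and 500 from class 1, in random order
y = [zeros(500,1); ones(500,1)];
y = y(randperm(1000));

% 99 columns of pure noise plus one column (number 100) that closely tracks y
x = rand(1000, 99);
x(:,100) = y + 0.1*rand(1000,1);       % noise scale 0.1 is an assumption

% hold out the last 300 rows for testing (split sizes are an assumption)
xtrain = x(1:700,:);   ytrain = y(1:700);
xtest  = x(701:end,:); ytest  = y(701:end);
```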
Then we would classify them:
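A sketch of the classification step, assuming the Statistics Toolbox function classify, which fits a linear discriminant by default. The setup lines repeat the data described above so the snippet runs on its own; the seed and split sizes are assumptions.

```matlab
% Setup repeated so this runs on its own (seed and split sizes are assumptions)
rng(0);
y = [zeros(500,1); ones(500,1)];  y = y(randperm(1000));
x = rand(1000, 99);               x(:,100) = y + 0.1*rand(1000,1);
xtrain = x(1:700,:);   ytrain = y(1:700);
xtest  = x(701:end,:);

% classify(sample, training, group): predict a label for each row of xtest
% using a discriminant fitted on xtrain/ytrain (default type is 'linear')
ypred = classify(xtest, xtrain, ytrain);
```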
And finally we would measure the error rate of the prediction:
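A sketch of the error measurement, counting the test points whose predicted label differs from the true one. The variable names nErr and errRate are illustrative, and the setup again repeats the assumed data and split.

```matlab
% Setup repeated so this runs on its own (seed and split sizes are assumptions)
rng(0);
y = [zeros(500,1); ones(500,1)];  y = y(randperm(1000));
x = rand(1000, 99);               x(:,100) = y + 0.1*rand(1000,1);
xtrain = x(1:700,:);   ytrain = y(1:700);
xtest  = x(701:end,:); ytest  = y(701:end);
ypred  = classify(xtest, xtrain, ytrain);

% number of misclassified test points, and the same as a fraction
nErr    = sum(ytest ~= ypred);
errRate = nErr / numel(ytest);
```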
and in this case we get perfect classification.
To make a function handle to be used with sequentialfs, just put these pieces together into one anonymous function that takes the training and test data and returns the number of misclassified test points, then pass it into sequentialfs together with x and y.
The final 1 in the output (a logical vector with one entry per column of x) indicates that variable 100 is, as expected, the best predictor of y among the variables in x.
The example in the documentation for sequentialfs is a little more complex, mostly because the predicted class labels are strings rather than numerical values as above, so ~strcmp is used to calculate the error rate rather than ~=. In addition, it makes use of cross-validation to estimate the error rate, rather than the direct evaluation on a single train/test split shown above.
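Putting the pieces together, a minimal sketch of the handle and the sequentialfs call might look like this. The seed and the noise scale are assumptions; the handle takes its four arguments in the order sequentialfs supplies them (training data, training labels, test data, test labels). Note that when no partition is given, sequentialfs itself calls the handle on 10-fold cross-validation splits of x and y, so no manual split is needed here.

```matlab
% Data as described in the text (seed and noise scale are assumptions)
rng(0);
y = [zeros(500,1); ones(500,1)];  y = y(randperm(1000));
x = rand(1000, 99);               x(:,100) = y + 0.1*rand(1000,1);

% Criterion: number of misclassified test points for a given train/test split
f = @(xtrain, ytrain, xtest, ytest) ...
        sum(ytest ~= classify(xtest, xtrain, ytrain));

% fs is a logical row vector with one entry per column of x
fs = sequentialfs(f, x, y);

% indices of the selected columns
selected = find(fs);
```

Here find(fs) is used instead of reading the printed vector, which is easier to check when x has many columns.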