Decision Trees in Matlab

Published 2024-08-16 05:27:04


I saw the help in Matlab, but they have provided an example without explaining how to use the parameters in the 'classregtree' function. Any help to explain the use of 'classregtree' with its parameters will be appreciated.

1 Answer

魂归处 2024-08-23 05:27:04


The documentation page of the function classregtree is self-explanatory...

Let's go over some of the most common parameters of the classification tree model:

  • x: data matrix, rows are instances, columns are predicting attributes
  • y: column vector, class label for each instance
  • categorical: specify which attributes are discrete type (as opposed to continuous)
  • method: whether to produce a classification or a regression tree (depends on the class type)
  • names: gives names to the attributes
  • prune: enable/disable reduced-error pruning
  • minparent/minleaf: specify the minimum number of instances a node must contain for it to be further split
  • nvartosample: used in random trees (consider K randomly chosen attributes at each node)
  • weights: specify weighted instances
  • cost: specify a cost matrix (the penalty for the various errors)
  • splitcriterion: criterion used to select the best attribute at each split. I'm only familiar with the Gini index, which is a variation of the Information Gain criterion.
  • priorprob: explicitly specify the prior class probabilities, instead of having them estimated from the training data
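As a sketch of how several of these parameters combine in one call (the option values here are illustrative choices, not from the original answer):

```matlab
%# hypothetical sketch: controlling tree growth with classregtree options
t2 = classregtree(x, y, 'method','classification', 'names',vars, ...
        'categorical',[2 4], ...       %# columns 2 and 4 hold discrete attributes
        'minparent',10, ...            %# only split nodes with >= 10 instances
        'minleaf',3, ...               %# every leaf must keep >= 3 instances
        'splitcriterion','gdi');       %# Gini diversity index: 1 - sum_i(p_i^2)
```

Note that minparent/minleaf trade bias for variance: larger values yield smaller, more conservative trees, which is an alternative to growing a full tree and pruning it afterwards.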

A complete example to illustrate the process:

%# load data
load carsmall

%# construct predicting attributes and target class
vars = {'MPG' 'Cylinders' 'Horsepower' 'Model_Year'};
x = [MPG Cylinders Horsepower Model_Year];  %# mixed continuous/discrete data
y = cellstr(Origin);                        %# class labels

%# train classification decision tree
t = classregtree(x, y, 'method','classification', 'names',vars, ...
                'categorical',[2 4], 'prune','off');
view(t)

%# test
yPredicted = eval(t, x);
cm = confusionmat(y,yPredicted);           %# confusion matrix
N = sum(cm(:));
err = ( N-sum(diag(cm)) ) / N;             %# testing error

%# prune tree to avoid overfitting
tt = prune(t, 'level',3);
view(tt)

%# predict a new unseen instance
inst = [33 4 78 NaN];
prediction = eval(tt, inst)    %# pred = 'Japan'

[figure: decision tree rendered by view(t)]


Update:

The classregtree class above is obsolete; it was superseded by the ClassificationTree and RegressionTree classes in R2011a (see the fitctree and fitrtree functions, new in R2014a).

Here is the updated example, using the new functions/classes:

t = fitctree(x, y, 'PredictorNames',vars, ...
    'CategoricalPredictors',{'Cylinders', 'Model_Year'}, 'Prune','off');
view(t, 'mode','graph')

y_hat = predict(t, x);
cm = confusionmat(y,y_hat);

tt = prune(t, 'Level',3);
view(tt)

predict(tt, [33 4 78 NaN])
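The error computed earlier is a resubstitution error (training and testing on the same data), so it is optimistic. As a sketch of a more honest estimate using the same ClassificationTree API (the 10-fold choice is illustrative):

```matlab
%# hypothetical sketch: estimate generalization error by cross-validation
cvt = crossval(t, 'KFold',10);   %# 10 cross-validated trees
cvErr = kfoldLoss(cvt)           %# average misclassification rate over folds
```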