Decision Trees in Matlab

Published 2024-08-16 05:27:04


I saw the help in Matlab, but they have provided an example without explaining how to use the parameters in the 'classregtree' function. Any help to explain the use of 'classregtree' with its parameters will be appreciated.

1 Answer

魂归处 2024-08-23 05:27:04


The documentation page of the function classregtree is self-explanatory...

Let's go over some of the most common parameters of the classification tree model:

  • x: data matrix, rows are instances, columns are predicting attributes
  • y: column vector, class label for each instance
  • categorical: specify which attributes are discrete type (as opposed to continuous)
  • method: whether to produce a classification or a regression tree (depends on the class type)
  • names: gives names to the attributes
  • prune: enable/disable reduced-error pruning
  • minparent/minleaf: specify the minimum number of instances a node must contain for it to be further split
  • nvartosample: used in random trees (consider K randomly chosen attributes at each node)
  • weights: specify weighted instances
  • cost: specify a cost matrix (the penalty for the various errors)
  • splitcriterion: criterion used to select the best attribute at each split. I'm only familiar with the Gini index, which is a variation of the Information Gain criterion.
  • priorprob: explicitly specify the prior class probabilities, instead of having them estimated from the training data
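As a sketch of how several of these parameters combine in one call (the option values here are illustrative choices, not from the original answer):

```matlab
%# hypothetical sketch: controlling tree growth with classregtree options
t2 = classregtree(x, y, 'method','classification', 'names',vars, ...
        'categorical',[2 4], ...       %# columns 2 and 4 hold discrete attributes
        'minparent',10, ...            %# only split nodes with >= 10 instances
        'minleaf',3, ...               %# every leaf must keep >= 3 instances
        'splitcriterion','gdi');       %# Gini diversity index: 1 - sum_i(p_i^2)
```

Note that minparent/minleaf trade bias for variance: larger values yield smaller, more conservative trees, which is an alternative to growing a full tree and pruning it afterwards.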

A complete example to illustrate the process:

%# load data
load carsmall

%# construct predicting attributes and target class
vars = {'MPG' 'Cylinders' 'Horsepower' 'Model_Year'};
x = [MPG Cylinders Horsepower Model_Year];  %# mixed continuous/discrete data
y = cellstr(Origin);                        %# class labels

%# train classification decision tree
t = classregtree(x, y, 'method','classification', 'names',vars, ...
                'categorical',[2 4], 'prune','off');
view(t)

%# test
yPredicted = eval(t, x);
cm = confusionmat(y,yPredicted);           %# confusion matrix
N = sum(cm(:));
err = ( N-sum(diag(cm)) ) / N;             %# testing error

%# prune tree to avoid overfitting
tt = prune(t, 'level',3);
view(tt)

%# predict a new unseen instance
inst = [33 4 78 NaN];
prediction = eval(tt, inst)    %# pred = 'Japan'

[figure: decision tree rendered by view(t)]


Update:

The classregtree class above is obsolete; it was superseded by the ClassificationTree and RegressionTree classes in R2011a (see the fitctree and fitrtree functions, new in R2014a).

Here is the updated example, using the new functions/classes:

t = fitctree(x, y, 'PredictorNames',vars, ...
    'CategoricalPredictors',{'Cylinders', 'Model_Year'}, 'Prune','off');
view(t, 'mode','graph')

y_hat = predict(t, x);
cm = confusionmat(y,y_hat);

tt = prune(t, 'Level',3);
view(tt)

predict(tt, [33 4 78 NaN])
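The error computed earlier is a resubstitution error (training and testing on the same data), so it is optimistic. As a sketch of a more honest estimate using the same ClassificationTree API (the 10-fold choice is illustrative):

```matlab
%# hypothetical sketch: estimate generalization error by cross-validation
cvt = crossval(t, 'KFold',10);   %# 10 cross-validated trees
cvErr = kfoldLoss(cvt)           %# average misclassification rate over folds
```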