Multivariate decision tree learners
Plenty of univariate decision tree learner implementations exist (C4.5 etc.), but does anyone actually know of multivariate decision tree learner algorithms?
Bennett and Blue's "A Support Vector Machine Approach to Decision Trees" performs multivariate splits by embedding an SVM at each decision node in the tree.
Similarly, in "Multicategory classification via discrete support vector machines" (2009), Orsenigo and Vercellis embed a multicategory variant of discrete support vector machines (DSVM) into the decision tree nodes.
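To make the idea of an embedded linear classifier concrete, here is a minimal sketch of a single oblique node that tests w.x + b > 0 instead of one feature against a threshold. It is not the method from either paper: a plain perceptron stands in for the SVM to keep the example dependency-free, and the names `train_hyperplane` and `route` as well as the toy data are made up for illustration.

```python
# One multivariate (oblique) tree node: the test is on a linear combination
# of features rather than on a single feature.
# Assumption: a perceptron replaces the SVM used in the cited papers.

def train_hyperplane(X, y, epochs=1000, lr=1.0):
    """Fit w, b with the perceptron rule. Labels in y must be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:  # misclassified or on the boundary
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # every training point is on its correct side
            break
    return w, b

def route(x, w, b):
    """Return the child ('left' or 'right') that sample x is sent to."""
    return "right" if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else "left"

# Toy data: the classes are separated by x0 + x1 = 3, which no single
# axis-parallel cut on x0 or x1 can reproduce.
X = [[0, 0], [1, 1], [2, 0], [0, 2], [3, 1], [1, 3], [2, 2], [4, 0], [0, 4]]
y = [-1, -1, -1, -1, 1, 1, 1, 1, 1]

w, b = train_hyperplane(X, y)
print([route(x, w, b) for x in X])
```

A full learner would apply this recursively to the samples routed to each child, and would swap the perceptron for a margin-based classifier as the papers do.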
The CART algorithm for decision trees can be made multivariate. CART is a binary splitting algorithm, as opposed to C4.5, which creates a node per unique value for discrete attributes. They also use the same algorithm for MARS as for missing values.
To create a multivariate tree, you compute the best split at each node, but instead of throwing away all the splits that weren't the best, you keep a portion of them (maybe all). You then evaluate each sample's attributes against every retained split at that node, weighting the splits by their rank. The first split (the one that led to the maximum gain) is weighted at 1, the split with the next highest gain is weighted by some fraction < 1.0, and so on, so the weights decrease as the gain of the split decreases. The resulting number is then compared to the same calculation for the nodes within the left node: if it's above that number, go left; otherwise go right. That's a pretty rough description, but that's a multivariate split for decision trees.
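Since the description above is rough, the following is only one possible reading of it, sketched in Python. The geometric decay of the weights, the normalised 0.5 decision threshold, and the `low_goes_left` orientation flag are all assumptions added here to make the vote well defined; none of them comes from the answer or from CART itself.

```python
# Hypothetical sketch of a rank-weighted vote over several splits at one node.
# Each split is (feature_index, cut, low_goes_left) and the list is assumed to
# be sorted by gain, best first.

def rank_weights(n_splits, decay=0.8):
    """Weight 1.0 for the best split, then decay**rank for the others."""
    return [decay ** rank for rank in range(n_splits)]

def route_weighted(x, splits, decay=0.8, threshold=0.5):
    """Send x left if the weighted share of 'left' votes exceeds threshold."""
    weights = rank_weights(len(splits), decay)
    left_score = 0.0
    for (feature, cut, low_goes_left), weight in zip(splits, weights):
        # A split votes 'left' when x falls on the side it maps to the left child.
        if (x[feature] <= cut) == low_goes_left:
            left_score += weight
    return "left" if left_score / sum(weights) > threshold else "right"

splits = [
    (0, 2.0, True),   # best split: x0 <= 2.0 goes left
    (1, 5.0, True),   # second best: x1 <= 5.0 goes left
    (2, 1.0, False),  # third best: x2 > 1.0 goes left
]

print(route_weighted([1.0, 9.0, 0.0], splits))  # best split is outvoted
print(route_weighted([1.0, 3.0, 0.0], splits))
```

Note that with a decay of 0.5 or less the top split always holds more than half of the total weight, so the node would behave exactly like a univariate split. The decay has to be larger than that for the lower-ranked splits to matter.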
Yes, there are some, such as OC1, but they are less common than learners that make univariate splits. Adding multivariate splits expands the search space enormously. As a sort of compromise, I have seen some logical learners which simply calculate linear discriminant functions and add them to the candidate variable list.
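The compromise mentioned above can be sketched as follows: compute one linear function of the inputs, append it as an extra column, and let an ordinary univariate split search choose among all columns. This is an illustrative sketch, not OC1 and not any particular learner. For brevity the discriminant direction is simply the difference of the two class means (which matches Fisher's discriminant only if the within-class covariance is assumed to be the identity), and all function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """Exhaustive univariate search; returns (gain, feature_index, threshold)."""
    parent, best = gini(y), (0.0, None, None)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            children = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if parent - children > best[0]:
                best = (parent - children, f, t)
    return best

def add_discriminant(X, y):
    """Append w.x as a new column, with w = mean(class 1) - mean(class 0)."""
    dims = range(len(X[0]))
    def class_mean(cls):
        rows = [row for row, yi in zip(X, y) if yi == cls]
        return [sum(row[d] for row in rows) / len(rows) for d in dims]
    w = [m1 - m0 for m0, m1 in zip(class_mean(0), class_mean(1))]
    return [list(row) + [sum(wi * xi for wi, xi in zip(w, row))] for row in X], w

X = [[0, 0], [1, 1], [2, 0], [0, 2], [3, 1], [1, 3], [2, 2], [4, 0], [0, 4]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

X_aug, w = add_discriminant(X, y)
print(best_split(X, y))      # best split using the original features only
print(best_split(X_aug, y))  # the derived column (index 2) is now a candidate
```

The search itself stays univariate, so its cost grows only by one extra column per node, which is why this is cheaper than searching over arbitrary hyperplanes.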