sklearn.ensemble.RandomForestClassifier returns inconsistent output
I have a trained sklearn RandomForestClassifier used as a multi-label classifier. In the training set, one class is always present, so you would expect the classifier to always return 1 for this class. It does, but the classifier returns [1] instead of [0, 1]. See the output below:
[array([[0.05, 0.95]]), array([[0.97, 0.03]]),
array([[0.95, 0.05]]), array([[1., 0.]]), array([[1., 0.]]),
array([[1., 0.]]), array([[0.65, 0.35]]), array([[1.]])]
Why is this the case, and how do I prevent it? The example shows the result for a single input, but in my case I pass a full data frame as input and transform the output into class predictions. That is not possible if one of the arrays has only a single column, [1], instead of two columns, [0, 1], like the predictions for the other classes.
Can this be changed with a setting in sklearn?
Extra clarification on why I have a training set with only positive-class samples: this is part of a recommender system, and sometimes a product is bought every time by every type of customer.
Comments (1)
I solved it with a simple list comprehension check that adds a second column to any inconsistent output array. The code to do this is below, where rfc_output is the random forest output that contains the inconsistent columns.
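The snippet itself is not preserved on this page, so what follows is only a minimal sketch of the approach described, not the answerer's actual code. It assumes that rfc_output is the list of per-label probability arrays shown in the question, and that a one-column array belongs to a label whose only training class was 1, so the missing column is the probability of class 0. The names rfc_fixed and positive_proba are hypothetical.

```python
import numpy as np

# Example output copied from the question: one array per label,
# each of shape (n_samples, number_of_classes_seen_for_that_label).
rfc_output = [
    np.array([[0.05, 0.95]]), np.array([[0.97, 0.03]]),
    np.array([[0.95, 0.05]]), np.array([[1.0, 0.0]]), np.array([[1.0, 0.0]]),
    np.array([[1.0, 0.0]]), np.array([[0.65, 0.35]]), np.array([[1.0]]),
]

# Assumption: a one-column array means only class 1 was seen in training,
# so a zero column for class 0 is prepended. Two-column arrays pass through.
rfc_fixed = [
    np.hstack([np.zeros((p.shape[0], 1)), p]) if p.shape[1] == 1 else p
    for p in rfc_output
]

# Every array is now (n_samples, 2), so the positive-class column
# of each label can be stacked into one (n_samples, n_labels) matrix.
positive_proba = np.column_stack([p[:, 1] for p in rfc_fixed])
print(positive_proba.shape)
```

If the only class seen for a label were 0 instead, the zero column would have to go on the other side. The fitted classifier's classes_ attribute, which is a list of arrays in the multi-output case, shows which class a single column refers to.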