sklearn.ensemble.RandomForestClassifier with inconsistent output

Posted on 2025-01-30 00:46:48

I have a trained sklearn random forest multi-label classifier. In the training set, one class is always present, so you would expect the classifier to always return 1 for this class. It does, but for that label the classifier returns [1] instead of [0, 1]. See the output below:

[array([[0.05, 0.95]]), array([[0.97, 0.03]]), 
array([[0.95, 0.05]]), array([[1., 0.]]), array([[1., 0.]]), 
array([[1., 0.]]), array([[0.65, 0.35]]), array([[1.]])]

Why is this the case, and how do I prevent it? The example shows the result for a single input, but in my case the input is a full data frame, which I transform into class predictions. That is not possible if one of the arrays has only a single column, [1], instead of two columns, [0, 1], like the predictions for the other classes.
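The behavior described above can be reproduced with a small sketch. The data here is synthetic and the names (X, Y, clf) are illustrative assumptions, not the original poster's setup. It suggests the cause: for each label, predict_proba returns one column per class seen during training, so a label that only ever took the value 1 gets a single column.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))

# Two labels: the first alternates 0/1, the second is 1 for every training row.
Y = np.column_stack([np.arange(40) % 2, np.ones(40, dtype=int)])

clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, Y)

# For multi-output targets, predict_proba returns a list with one array per label.
proba = clf.predict_proba(X[:1])

# Each array has one column per class observed for that label during training.
shapes = [p.shape for p in proba]
classes = [c.tolist() for c in clf.classes_]
print(shapes)   # the constant label has a single column
print(classes)  # the constant label lists only class 1
```

The per-label entries of clf.classes_ show which class each column refers to, which is what makes it possible to pad the missing column in the right position afterwards.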

Can this be changed with a setting in sklearn?

Extra clarification on why I have a training set with only positive samples for one class: this is part of a recommender system, and sometimes a product is bought every time by every type of customer.

Comments (1)

呆° 2025-02-06 00:46:48

I solved it with a simple list comprehension that adds a second column to each inconsistent output array. The code is below, where rfc_output is the random forest output that contains the inconsistent arrays.

# Check the column count via shape; len(x[1]) raises IndexError when an array has a single row.
rfc_output = [np.c_[x, np.zeros(x.shape[0])] if x.shape[1] < 2 else x for x in rfc_output]
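One caveat: appending the zero column on the right turns [1] into [1, 0], which reads as class 0 if column 0 is taken to mean class 0. When the class that is always present is 1, the desired row is [0, 1]. A slightly more general sketch, assuming binary labels 0 and 1, places each existing column according to the classes seen for that label. The helper name pad_proba and its all_classes parameter are hypothetical, not part of sklearn.

```python
import numpy as np

def pad_proba(proba_list, classes_list, all_classes=(0, 1)):
    """Expand each per-label probability array to one column per entry of all_classes.

    proba_list:   list of arrays of shape (n_samples, n_seen_classes)
    classes_list: list of arrays holding the classes seen for each label
    all_classes:  full set of classes expected for every label (assumed 0 and 1)
    """
    padded = []
    for proba, classes in zip(proba_list, classes_list):
        full = np.zeros((proba.shape[0], len(all_classes)))
        for col, cls in enumerate(classes):
            # Copy each existing column into the slot of the class it belongs to.
            full[:, all_classes.index(cls)] = proba[:, col]
        padded.append(full)
    return padded

# Example with the shapes from the question: the second label only ever saw class 1.
proba_list = [np.array([[0.05, 0.95]]), np.array([[1.0]])]
classes_list = [np.array([0, 1]), np.array([1])]
padded = pad_proba(proba_list, classes_list)
stacked = np.stack(padded)  # shape (n_labels, n_samples, n_classes)
```

With a fitted multi-output classifier, the list stored in its classes_ attribute could be passed as classes_list, so the padding does not depend on guessing which class was missing.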