How does sklearn's RandomForest predict probabilities for multi-label classification?

Posted on 2025-02-01 16:17:22

I have used the random forest implementation from the sklearn library to solve a multi-label classification problem. After fitting the model, the predictions (obtained with sklearn's predict_proba) compared to the target values in the test set look like this:

y_target : [1,    0,  0,  0,  1,    0, 1,    0]
y_predict: [0.98, 0,  0,  0,  0.93, 0, 0.4,  0]

y_target : [0,    0,    0,    1,    1,    0,    0,    1   ]
y_predict: [0.36, 0.08, 0.03, 0.44, 0.68, 0.05, 0.05, 0.03]

The model performs well, but I don't understand how these probabilities are produced. For each class, it seems to predict the probability of that class being 0 or 1 on its own, instead of distributing probability across the classes: the values in y_predict sum to more than 1. Does it build a separate random forest for each class and then compute each class's probability as a fraction of the votes?
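To make the observation about the sums concrete, here is a small check using only the two y_predict vectors quoted above. The variable names are mine, chosen for illustration.

```python
# The two prediction vectors quoted in the question.
y_predict_1 = [0.98, 0, 0, 0, 0.93, 0, 0.4, 0]
y_predict_2 = [0.36, 0.08, 0.03, 0.44, 0.68, 0.05, 0.05, 0.03]

# If the 8 values formed one distribution over mutually exclusive
# classes, each sum would be 1. Rounded to avoid float noise.
sum_1 = round(sum(y_predict_1), 2)
sum_2 = round(sum(y_predict_2), 2)

print(sum_1)  # 2.31
print(sum_2)  # 1.72
```

Both sums exceed 1, which is what one would expect if each of the 8 values is a separate per-label probability rather than a share of a single distribution.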

Note that my input X has 41 features and the output Y is a binary indicator vector of size 8 (one entry per label, and several entries can be 1 at once). The settings I use for the model are shown below:

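from sklearn.ensemble import RandomForestClassifier

# Settings used for the multi-label model (X: 41 features, Y: 8 binary labels)
rfc = RandomForestClassifier(n_estimators=100,
                             bootstrap=True,
                             max_depth=None,
                             max_features='sqrt',
                             random_state=None,
                             min_weight_fraction_leaf=0.0,
                             class_weight='balanced')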

I hope someone can help me to clarify this.
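One way to investigate this is a minimal reproduction on synthetic data, sketched below. It assumes the same shapes as in the question (41 features, 8 binary labels); the random data, the fixed seeds, and the names X, Y, proba, and p_label are made up for illustration and are not the original dataset. When fitted on a 2D label matrix, predict_proba returns a list with one array per label, so the sketch inspects that structure and the number of fitted trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption: not the data from the question).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 41))                      # 41 features
noise = 0.5 * rng.normal(size=(300, 8))
Y = (X[:, :8] + noise > 0.5).astype(int)            # 8 binary labels per row

rfc = RandomForestClassifier(n_estimators=100,
                             max_features='sqrt',
                             class_weight='balanced',
                             random_state=0)        # seed fixed for repeatability
rfc.fit(X, Y)

# One forest of 100 multi-output trees, not one forest per label.
print(len(rfc.estimators_), rfc.n_outputs_)

# For 2D targets, predict_proba returns a list: one array per label,
# each of shape (n_samples, n_classes_of_that_label).
proba = rfc.predict_proba(X[:5])
print(type(proba), len(proba), proba[0].shape)

# Within one label, P(0) + P(1) = 1 for every sample.
print(np.allclose(proba[0].sum(axis=1), 1.0))

# Column 1 is P(label = 1); this assumes every label has both classes
# in the training data, which holds for this synthetic Y.
p_label = np.column_stack([p[:, 1] for p in proba])  # shape (5, 8)
print(p_label.sum(axis=1))  # row sums are not constrained to equal 1
```

In this sketch each label gets its own two-class distribution from the same set of trees, so a vector of P(label = 1) values like y_predict above has no reason to sum to 1 across the 8 labels.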