How does sklearn's RandomForest predict probabilities for multi-label classification?
I have implemented a random forest from the sklearn library to solve a multi-label classification problem. After fitting the model, the predictions (obtained with sklearn's predict_proba) compared to the target values in the test set look like this:
y_target : [1, 0, 0, 0, 1, 0, 1, 0]
y_predict: [0.98, 0, 0, 0, 0.93, 0, 0.4, 0]
y_target : [0, 0, 0, 1, 1, 0, 0, 1]
y_predict: [0.36, 0.08, 0.03, 0.44, 0.68, 0.05, 0.05, 0.03]
The model performs well; however, I don't understand how these probabilities are produced. For each class, it seems to predict the probability of that class being 0 or 1 on its own, instead of distributing probability among the classes: the values in each y_predict row sum to well over 1. Does it build a separate random forest for each class and then compute the probability per class as a fraction of the votes?
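As a quick check of the sums mentioned above, the two y_predict rows can be added up directly. This is only arithmetic on the numbers quoted in the question; the variable names are illustrative.

```python
# The two prediction rows quoted in the question.
y_predict_1 = [0.98, 0, 0, 0, 0.93, 0, 0.4, 0]
y_predict_2 = [0.36, 0.08, 0.03, 0.44, 0.68, 0.05, 0.05, 0.03]

# Sum each row; rounding hides floating-point noise.
row_sums = [round(sum(row), 2) for row in (y_predict_1, y_predict_2)]
print(row_sums)  # [2.31, 1.72]
```

Both rows sum to more than 1, which is what one would expect if each entry is a separate "label present" probability rather than a share of a single distribution over the 8 classes.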
Note that my input X has 41 features and the output Y is a multi-hot (binary indicator) vector of size 8, so several entries can be 1 at once. The settings I use for the model are shown below:
from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier(n_estimators=100,
                             bootstrap=True,
                             max_depth=None,
                             max_features='sqrt',
                             random_state=None,
                             min_weight_fraction_leaf=0,
                             class_weight='balanced')
I hope someone can help me clarify this.
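A minimal sketch for inspecting what predict_proba returns in this setting, assuming synthetic stand-in data with the same shapes as in the question (41 features, 8 binary labels). The data, the names proba_list, per_label and tree_mean, and random_state=0 are illustrative assumptions, not taken from the question. In scikit-learn, when Y is a 2-D indicator matrix, one forest is fit on all outputs together, and predict_proba returns one array per output rather than a single distribution over the 8 labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): 41 features, 8 binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 41))
Y = (X[:, :8] + 0.5 * rng.normal(size=(400, 8)) > 0).astype(int)

rfc = RandomForestClassifier(n_estimators=100,
                             max_features='sqrt',
                             class_weight='balanced',
                             random_state=0)
rfc.fit(X, Y)

# For a 2-D Y, predict_proba returns a list with one array per label,
# each of shape (n_samples, 2): columns are P(label = 0) and P(label = 1).
proba_list = rfc.predict_proba(X[:5])
print(len(proba_list), proba_list[0].shape)

# Stack the P(label = 1) columns into an (n_samples, 8) matrix,
# the same layout as the y_predict rows in the question.
per_label = np.column_stack([p[:, 1] for p in proba_list])
print(per_label.sum(axis=1))  # row sums are not constrained to 1

# The forest's probability for a label is the mean of the per-tree
# probabilities from the single set of 100 trees.
k = 0
tree_mean = np.mean(
    [tree.predict_proba(X[:5])[k] for tree in rfc.estimators_], axis=0
)
print(np.allclose(tree_mean, proba_list[k]))
```

Each label's two-column array sums to 1 on its own; only the stacked P(label = 1) values can add up to more than 1. With fully grown trees (max_depth=None) the leaves are usually pure, so this per-tree average is typically close to a fraction of votes.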