How do I interpret predict_proba?

Posted on 2025-02-10 01:07:29 | Length: 807 | Views: 1 | Comments: 0

I'm learning employee turnover prediction, and I got the result below from predict_proba.
Looking at the first row, I would interpret it as: this employee has an 83% probability of leaving the company.
Do I understand this correctly?

    Output exceeds the size limit. Open the full output data in a text editor
array([[0.17, 0.83],
       [0.43, 0.57],
       [0.29, 0.71],
       [0.94, 0.06],
       [0.98, 0.02],
       [0.84, 0.16],
       [0.64, 0.36],
       [1.  , 0.  ],
       [0.85, 0.15],
       [0.99, 0.01],
       [0.09, 0.91],
       [0.89, 0.11],
       [0.21, 0.79],
       [0.15, 0.85],
       [0.78, 0.22],
       [0.18, 0.82],
       [0.84, 0.16],
       [0.45, 0.55],
       [0.96, 0.04],
       [0.95, 0.05],
       [0.91, 0.09],
       [0.9 , 0.1 ],
       [1.  , 0.  ],
       [0.91, 0.09],
       [0.74, 0.26],
...
       [0.94, 0.06],
       [0.99, 0.01],
       [0.22, 0.78],
       [0.89, 0.11],
       [0.98, 0.02]])
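
For readers checking which column is which: in scikit-learn, the columns of the predict_proba output follow the order of the fitted estimator's classes_ attribute, so the second column is the score for "left" only if that class is listed second. Below is a minimal sketch on synthetic data. The features, the 0 = stayed / 1 = left encoding, and the choice of RandomForestClassifier are assumptions for illustration, not details from the question.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for an employee dataset (hypothetical features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Assumed label encoding: 1 = left the company, 0 = stayed.
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# One row per sample, one column per class.
proba = clf.predict_proba(X[:5])

# The column order matches clf.classes_, so look the column up
# instead of assuming that "left" is always column 1.
print(clf.classes_)
leave_col = list(clf.classes_).index(1)
p_leave = proba[:, leave_col]
print(p_leave)
```

Each row sums to 1, so with two classes the first column is simply one minus the second.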

Comments (1)

生生不灭 2025-02-17 01:07:29

The model score is a measure of how certain the model is about the outcome. However, it is not necessarily the same as a probability: a score of 0.83 does not by itself mean that 83% of the people with that score actually leave. Logistic regression scores are probabilities by design, but for random forest the behaviour is implementation-defined. If you want to integrate your scores into business metrics directly, you'll need to calibrate your model first (using e.g. sklearn.calibration.CalibratedClassifierCV or isotonic regression).
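
A minimal sketch of what that calibration step could look like, assuming a random forest and synthetic data (the dataset, the hyperparameters, and the 0/1 label encoding are illustrative assumptions). It wraps the forest in CalibratedClassifierCV with isotonic regression, then uses calibration_curve to compare the mean predicted score in each bin with the fraction of positives actually observed there, which is the "do 83% of the 0.83 scores really leave?" check.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the turnover dataset (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + X[:, 1] + rng.normal(size=1000) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Uncalibrated forest.
raw = RandomForestClassifier(n_estimators=50, random_state=0)
raw.fit(X_train, y_train)

# Same kind of forest, calibrated with isotonic regression via 5-fold CV.
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    method="isotonic",
    cv=5,
)
calibrated.fit(X_train, y_train)

p_raw = raw.predict_proba(X_test)[:, 1]
p_cal = calibrated.predict_proba(X_test)[:, 1]

# Brier score: mean squared error of the predicted probabilities (lower is better).
print("Brier raw:       ", brier_score_loss(y_test, p_raw))
print("Brier calibrated:", brier_score_loss(y_test, p_cal))

# Reliability check: observed positive rate vs. mean predicted score per bin.
frac_pos, mean_pred = calibration_curve(y_test, p_cal, n_bins=5)
for observed, predicted in zip(frac_pos, mean_pred):
    print(f"predicted ~{predicted:.2f} -> observed {observed:.2f}")
```

Whether calibration actually improves the scores depends on the data and the model, so it is worth comparing both versions on a held-out set as above rather than assuming a gain.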
