Interpreting a confusion matrix on perfectly balanced data

Posted on 2025-02-12 11:58:16

I have trained a transformer-based classifier with 2 classes (0, 1) that reaches 91% accuracy on a perfectly balanced dataset. After tuning the decision threshold on the validation data, I printed the confusion matrix for that same data. The results are below, and the matrix itself comes out perfectly balanced: the number of false positives is exactly equal to the number of false negatives. Does that make sense in your opinion?

09:29:30 root INFO:*** EVALUATION ON VALIDATION DATA ***
09:29:30 root INFO:AUC: 0.9708
09:29:30 root INFO:Tuned Threshold: 0.3104
09:29:31 root INFO:Matthews Correlation Coefficient computed after applying the tuned/selected threshold : 0.8230210619188743
09:29:31 root INFO:Accuracy: 91.15%
09:29:32 root INFO:--Classification report for VAL DATA--
09:29:32 root INFO:              precision    recall  f1-score   support

          0       0.91      0.91      0.91     88406
          1       0.91      0.91      0.91     88406

   accuracy                           0.91    176812
  macro avg       0.91      0.91      0.91    176812
weighted avg       0.91      0.91      0.91    176812

        pred:0  pred:1
true:0   80583    7823
true:1    7823   80583
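
As a sanity check, the reported metrics can be recomputed from the four counts in the matrix. The sketch below is only arithmetic on the numbers in the log; the variable names are my own, and it says nothing about how the threshold was tuned, since that is not described here. It also makes one point explicit: FP equals FN exactly when the number of samples predicted as class 1 equals the number of samples that truly are class 1, and with equal class supports that forces TN to equal TP as well.

```python
import math

# Validation confusion matrix copied from the log above
# (rows: true class, columns: predicted class).
tn, fp = 80583, 7823
fn, tp = 7823, 80583

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)

# Matthews correlation coefficient computed from the four counts.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# FP == FN is equivalent to "predicted positives == actual positives".
predicted_pos = tp + fp
actual_pos = tp + fn

print(f"accuracy={accuracy:.4f} mcc={mcc:.6f}")
print(f"precision_1={precision_1:.4f} recall_1={recall_1:.4f}")
print(f"predicted_pos={predicted_pos} actual_pos={actual_pos}")
```

The recomputed accuracy and MCC agree with the logged 91.15% and 0.8230, so the matrix and the report are at least internally consistent.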

Thanks for the advice.

UPDATE:

Confusion matrix on the test set using the same threshold:

        pred:0  pred:1
true:0   81714    9968
true:1    9612   82070
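
To compare the two matrices side by side, here is a minimal sketch that computes accuracy, per-class recall, and the gap between false positives and false negatives for each. The helper name `summarize` and the dictionary keys are hypothetical, chosen only for this illustration; the counts are the ones printed above.

```python
def summarize(tn, fp, fn, tp):
    """Return accuracy, per-class recall and the FP/FN gap for a 2x2 matrix."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "recall_0": tn / (tn + fp),
        "recall_1": tp / (tp + fn),
        "fp_minus_fn": fp - fn,
    }


# Counts copied from the validation and test matrices in this post.
val = summarize(tn=80583, fp=7823, fn=7823, tp=80583)
test = summarize(tn=81714, fp=9968, fn=9612, tp=82070)

for name, stats in (("val", val), ("test", test)):
    print(name, {key: round(value, 4) for key, value in stats.items()})
```

On the test set the exact symmetry disappears (FP and FN differ by 356) and accuracy comes out at about 89.3%, so the perfect balance appears only on the data the threshold was tuned on.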
