NLTK & Python: plotting a ROC curve

Posted on 2024-12-16 18:57:21

I am using nltk with Python and I would like to plot the ROC curve of my classifier (Naive Bayes). Is there any function for plotting it, or should I track the true positive and false positive rates myself?

It would be great if someone could point me to some code that already does this...

Thanks.

Comments (1)

李白 2024-12-23 18:57:21

PyROC looks simple enough: see its tutorial and source code.

This is how it would work with the NLTK Naive Bayes classifier:

# class labels are 0 and 1
labeled_data = [
    (1, featureset_1),
    (0, featureset_2),
    (1, featureset_3),
    # ...
]

# naive_bayes is your already trained classifier,
# preferably not trained on the data you're testing on :)

from pyroc import ROCData

# ROCData takes (actual label, score) pairs; here the score is the
# probability that the classifier assigns to class 1
roc_data = ROCData(
    (label, naive_bayes.prob_classify(featureset).prob(1))
    for label, featureset in labeled_data
)
roc_data.plot()
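
For context, here is a minimal, hypothetical sketch of how the naive_bayes classifier and labeled_data assumed above could be produced with NLTK; the bag-of-words feature extractor and the tiny train/test sets are stand-ins, not part of the original answer:

import nltk

def bag_of_words(text):
    # hypothetical feature extractor: every word becomes a boolean feature
    return {word: True for word in text.split()}

# NLTK trains on (featureset, label) pairs
train_set = [
    (bag_of_words("good great fine"), 1),
    (bag_of_words("bad awful poor"), 0),
]
naive_bayes = nltk.NaiveBayesClassifier.train(train_set)

# held-out data, in the (label, featureset) order used above
labeled_data = [
    (1, bag_of_words("great fine")),
    (0, bag_of_words("awful bad")),
]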

Edits:

  • ROC is for binary classifiers only. If you have three classes, you can measure the performance of your positive and negative class separately (by counting the other two classes as 0, like you proposed).
  • The library expects the output of a decision function as the second value of each tuple. It then tries all possible thresholds, e.g. f(x) >= 0.8 => classify as 1, and plots a point for each threshold (that's why you get a curve in the end). So if your classifier guesses class 0, you actually want a value close to zero; that's why I proposed .prob(1). A rough sketch of this threshold sweep is shown below.
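
If you prefer to track the true/false positive rates yourself rather than use PyROC, the following is a rough sketch of the threshold sweep described above (not PyROC's actual implementation; tied scores are ignored for simplicity):

def roc_points(scored):
    # scored: iterable of (label, score) pairs, labels in {0, 1};
    # score is e.g. naive_bayes.prob_classify(featureset).prob(1)
    pairs = sorted(scored, key=lambda p: p[1], reverse=True)
    pos = sum(1 for label, _ in pairs if label == 1)
    neg = len(pairs) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for label, _ in pairs:
        # lower the threshold past one more example
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))  # (FPR, TPR) at this threshold
    return points

# curve = roc_points(
#     (label, naive_bayes.prob_classify(featureset).prob(1))
#     for label, featureset in labeled_data
# )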