Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 10 years ago.
scikit-learn has an implementation of a Gaussian naive Bayes classifier. In general, the library aims for a good trade-off between efficiency and code that is easy to read and use. It should be a good library for learning how the algorithms work.
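To make the idea concrete, here is a minimal standard-library sketch of what a Gaussian naive Bayes classifier computes. It is not scikit-learn's code (there the class is `sklearn.naive_bayes.GaussianNB`); the `TinyGaussianNB` name, the toy data, and the small variance floor are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class TinyGaussianNB:
    """Illustrative Gaussian naive Bayes; NOT scikit-learn's implementation."""

    def fit(self, X, y):
        # Group the training rows by class label.
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)

        self.priors = {}
        self.stats = {}
        for label, rows in by_class.items():
            self.priors[label] = len(rows) / len(X)
            columns = list(zip(*rows))
            means = [sum(col) / len(col) for col in columns]
            # Assumed small floor so the variance is never exactly zero.
            variances = [
                sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (means, variances)
        return self

    def _log_posterior(self, row, label):
        # log P(class) + sum of per-feature Gaussian log-densities.
        means, variances = self.stats[label]
        total = math.log(self.priors[label])
        for v, m, var in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total

    def predict(self, X):
        return [
            max(self.stats, key=lambda c: self._log_posterior(row, c))
            for row in X
        ]


# Toy data: two well-separated clusters with two continuous features each.
X = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [5.0, 6.2], [5.1, 5.8], [4.9, 6.0]]
y = ["a", "a", "a", "b", "b", "b"]

model = TinyGaussianNB().fit(X, y)
predictions = model.predict([[1.1, 2.0], [5.0, 6.0]])
print(predictions)
```

Working in log space avoids underflow when many per-feature densities are multiplied together.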
This might be a good place to start. It's the full source code (the text parser, the data storage, and the classifier) for a Python implementation of a naive Bayesian classifier. Although it's complete, it's still small enough to digest in one session. I think the code is reasonably well written and well commented. This is part of the source code files for the book Programming Collective Intelligence.
To get the source, click the link, download and unpack the zip, then from the main folder 'PCI_Code' go to the folder 'chapter 6', which has a Python source file, 'docclass.py'. That's the complete source code for a Bayesian spam filter. The training data (emails) are persisted in an SQLite database, which is included in the same folder ('test.db'). The only external library you need is the Python bindings to SQLite (pysqlite); you also need SQLite itself if you don't already have it installed.
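As a rough picture of how a classifier with SQLite-backed training data can fit together, here is a small sketch using the standard-library `sqlite3` module. It is not the book's 'docclass.py'; the `SqliteNaiveBayes` class, the table names, the tokenizer, and the toy messages are all hypothetical, and it uses an in-memory database rather than 'test.db'.

```python
import math
import re
import sqlite3


class SqliteNaiveBayes:
    """Hypothetical sketch: naive Bayes text classifier with counts in SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS word_counts "
            "(word TEXT, label TEXT, n INTEGER, PRIMARY KEY (word, label))"
        )
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS doc_counts "
            "(label TEXT PRIMARY KEY, n INTEGER)"
        )

    @staticmethod
    def tokenize(text):
        # Each distinct lowercase word counts once per document.
        return set(re.findall(r"[a-z']+", text.lower()))

    def train(self, text, label):
        for word in self.tokenize(text):
            self.db.execute(
                "INSERT OR IGNORE INTO word_counts VALUES (?, ?, 0)", (word, label)
            )
            self.db.execute(
                "UPDATE word_counts SET n = n + 1 WHERE word = ? AND label = ?",
                (word, label),
            )
        self.db.execute("INSERT OR IGNORE INTO doc_counts VALUES (?, 0)", (label,))
        self.db.execute("UPDATE doc_counts SET n = n + 1 WHERE label = ?", (label,))
        self.db.commit()

    def classify(self, text):
        doc_counts = dict(self.db.execute("SELECT label, n FROM doc_counts"))
        total_docs = sum(doc_counts.values())
        scores = {}
        for label, n_docs in doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            for word in self.tokenize(text):
                row = self.db.execute(
                    "SELECT n FROM word_counts WHERE word = ? AND label = ?",
                    (word, label),
                ).fetchone()
                count = row[0] if row else 0
                # Laplace smoothing so unseen words never give log(0).
                score += math.log((count + 1) / (n_docs + 2))
            scores[label] = score
        return max(scores, key=scores.get)


clf = SqliteNaiveBayes()
clf.train("cheap pills buy now", "spam")
clf.train("buy cheap watches now", "spam")
clf.train("meeting notes for monday", "ham")
clf.train("lunch on monday with the team", "ham")

print(clf.classify("buy cheap pills"))
print(clf.classify("monday meeting"))
```

Passing a file path instead of `":memory:"` would keep the trained counts between runs.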
If you're processing natural language, check out the Natural Language Toolkit.
If you're looking for something else, here's a simple search on PyPI.
pebl appears to handle continuous variables.
I found Divmod Reverend to be the simplest and easiest to use Python Bayesian classifier.
I just took Paul Graham's Lisp stuff and converted it to Python:
http://www.paulgraham.com/spam.html
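For a sense of what such a conversion might look like, here is a rough Python sketch of the scoring scheme as I read it in that essay. It is not the converted code mentioned above. The constants (doubling the good counts, clamping to 0.01 and 0.99, 0.4 for unfamiliar words, the 15 most interesting tokens, a minimum of 5 occurrences) follow my reading of the essay and should be treated as assumptions; the function names, the tokenizer, and the toy corpus are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z'$-]+", text.lower())


def count_tokens(docs):
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def word_prob(word, good, bad, ngood, nbad, min_count=5):
    g = 2 * good[word]  # good counts are doubled to bias against false positives
    b = bad[word]
    if g + b < max(min_count, 1):
        return 0.4  # assumed score for words seen too rarely to judge
    good_ratio = min(1.0, g / ngood)
    bad_ratio = min(1.0, b / nbad)
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))


def spam_prob(text, good, bad, ngood, nbad, min_count=5, top_n=15):
    probs = [
        word_prob(w, good, bad, ngood, nbad, min_count)
        for w in set(tokenize(text))
    ]
    # Keep the tokens whose probability is farthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:top_n]
    prod = math.prod(probs)
    inv = math.prod(1 - p for p in probs)
    return prod / (prod + inv)


# Toy corpus; min_count=1 below only because the corpus is tiny.
good_docs = ["meeting notes for monday", "lunch on monday with the team"]
bad_docs = ["cheap pills buy now", "buy cheap watches now"]
good, bad = count_tokens(good_docs), count_tokens(bad_docs)

spam_score = spam_prob("buy cheap pills now", good, bad,
                       len(good_docs), len(bad_docs), min_count=1)
ham_score = spam_prob("monday lunch with the team", good, bad,
                      len(good_docs), len(bad_docs), min_count=1)
print(spam_score, ham_score)
```

A message would then be flagged when its score exceeds some threshold; the essay uses 0.9, if I recall correctly.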
There's also SpamBayes, which I think can be used as a general naive Bayesian classifier, not just for spam.