Naive Bayesian spam filter question
I am planning to implement a spam filter using the Naive Bayesian classification model.
Online I see a lot of information on Naive Bayesian classification, but the problem is that it is mostly mathematical material rather than a clear statement of how it is done. I am also more of a programmer than a mathematician (yes, I learnt probability and Bayes' theorem back in school, but I have been out of touch for a long, long time, and I don't have the luxury of relearning it now: I have nearly 3 weeks to come up with a working prototype).
So if someone can explain it, or point me to a place where it is explained for programmers rather than mathematicians, it would be a great help.
PS: By the way, I have to implement it in C, if you want to know. :(
Regards,
Microkernel
Comments (5)
The book Programming Collective Intelligence has a chapter that covers this and other methods. The chapter (#6) can be understood without reference to previous chapters, is written clearly, and discusses only the minimal mathematics necessary to get the job done.
You could try this website. It's got some source code.
I would highly recommend Andrew Moore's tutorials and I think you should start with this one.
You could also take a look at POPFile, an open source spam filter engine.
Have you looked at dspam?
http://dspam.irontec.com/faq.shtml#1.0
http://www.nuclearelephant.com/