Trained machine learning classifier/model for spam
I have a list of about 17 million sentences. I need to classify each sentence as spam, ham, or unsure. Are there pre-trained models available on the internet that I could simply feed my data to as a "test" set, so that the system classifies each sentence as spam or ham?
Note: The sentences aren't e-mails.
Comments (1)
You can use Bayesian spam filtering. This paper is a nice introduction to the theory: http://robotics.stanford.edu/users/sahami/papers-dir/spam.pdf
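To give a rough idea of what that looks like in practice, here is a minimal sketch of a naive Bayes filter with a spam/ham/unsure decision. It is not the exact method from the paper: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the 0.9/0.1 thresholds and the six toy training sentences are all assumptions made up for illustration. Since your sentences aren't e-mails, you would most likely have to label a sample of your own data and train on that.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        # Naive whitespace tokenizer; real data would need something better.
        return text.lower().split()

    def train(self, labeled_sentences):
        for text, label in labeled_sentences:
            tokens = self.tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def spam_probability(self, text):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for tok in self.tokenize(text):
                if tok not in self.vocab:
                    continue  # words never seen in training carry no evidence
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            log_scores[label] = score
        # Normalize in a numerically safe way (subtract the max first).
        m = max(log_scores.values())
        exp_spam = math.exp(log_scores["spam"] - m)
        exp_ham = math.exp(log_scores["ham"] - m)
        return exp_spam / (exp_spam + exp_ham)

    def classify(self, text, spam_cutoff=0.9, ham_cutoff=0.1):
        # Anything between the two cutoffs is reported as "unsure".
        p = self.spam_probability(text)
        if p >= spam_cutoff:
            return "spam"
        if p <= ham_cutoff:
            return "ham"
        return "unsure"


# Hypothetical hand-labeled sample; replace with sentences from your own data.
training_data = [
    ("win a free prize now", "spam"),
    ("cheap pills buy now", "spam"),
    ("free money click here", "spam"),
    ("meeting moved to monday morning", "ham"),
    ("please review the attached report", "ham"),
    ("lunch at noon tomorrow", "ham"),
]

spam_filter = NaiveBayesSpamFilter()
spam_filter.train(training_data)

for sentence in ["free prize click now", "please review the report for monday", "hello world"]:
    print(sentence, "->", spam_filter.classify(sentence))
```

Classification is one pass over the tokens of each sentence, so running this over 17 million sentences is mostly a matter of streaming them through `classify`. The harder part is getting labeled examples that resemble your data.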