We don’t allow questions seeking recommendations for software libraries, tutorials, tools, books, or other off-site resources. You can edit the question so it can be answered with facts and citations.
Closed 9 years ago.
Patterns are more important than individual words (with one exception: you can be pretty much 100% confident that anything sent as an HTML mail with "#FF0000" in it is not worth reading). Take a look at http://en.wikipedia.org/wiki/Bayesian_spam_filtering and the references it lists for one approach. IIRC, one of the first experiments with the technique found that, after training the filter, "#FF0000" was the expression most likely to indicate spam (see, I told you so).
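To make the Bayesian idea concrete, here is a minimal sketch in Python (not the C# the thread is about, and not taken from any linked article). It assumes a tiny hand-labelled training set, a simple regex tokenizer that keeps "#" so colour codes survive as tokens, and add-one smoothing. The class name `NaiveBayesSpamFilter` and its methods are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter

# Assumption: tokens are runs of word characters, '#' and apostrophes,
# so an HTML colour code such as "#FF0000" stays a single token.
TOKEN_RE = re.compile(r"[#\w']+")


def tokenize(text):
    """Lowercase the text and return the set of distinct tokens in it."""
    return set(TOKEN_RE.findall(text.lower()))


class NaiveBayesSpamFilter:
    """Hypothetical minimal filter: counts in how many messages each token appears."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labelled message; label must be 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.token_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Return P(spam | tokens present in text) under the naive independence assumption."""
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        # Smoothed prior log-odds of spam versus ham.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for token in tokenize(text):
            # Add-one smoothing keeps unseen tokens from zeroing the product.
            p_token_spam = (self.token_counts["spam"][token] + 1) / (n_spam + 2)
            p_token_ham = (self.token_counts["ham"][token] + 1) / (n_ham + 2)
            log_odds += math.log(p_token_spam / p_token_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up training messages, purely for illustration.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train('<font color="#FF0000">Buy cheap pills now</font>', "spam")
spam_filter.train('Win money now <font color="#ff0000">click</font>', "spam")
spam_filter.train("Meeting moved to Monday at noon", "ham")
spam_filter.train("Please review the attached report", "ham")

print(spam_filter.spam_probability('<b color="#FF0000">cheap offer</b>'))
print(spam_filter.spam_probability("Monday report review"))
```

With this toy data the red-text message scores well above 0.5 and the plain one well below it. A real filter would need far more training mail and a more careful tokenizer.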
You may want to check out Paul Graham's spam filtering article. You can also have a look at a C# implementation of a spam filter using a Naive Bayes classifier.
Here is a trivial hand-made spam filter based on a word blacklist: LINQ Query for Blacklist-Based Spam Filter
This solution is suitable when you don't want to add libraries or build a complicated custom solution.
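The linked LINQ query is not reproduced here, but the blacklist idea can be sketched in a few lines. The following is a Python stand-in rather than the original C#, and the blacklist contents and the `is_spam` function name are made up for illustration.

```python
import re

# Hypothetical blacklist; a real one would be maintained from observed spam.
BLACKLIST = {"viagra", "casino", "lottery"}


def is_spam(text, blacklist=BLACKLIST):
    """Return True if any whole word of the message is on the blacklist."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return not words.isdisjoint(blacklist)


print(is_spam("Win the LOTTERY today!"))  # True
print(is_spam("Lunch at noon?"))          # False
```

Matching whole words rather than substrings avoids flagging innocent words that merely contain a blacklisted term, at the cost of missing obfuscated spellings.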