Detecting spammers with MySQL
I see an ever-increasing number of users signing up on my site just to send duplicate spam messages to other users. I've added some server-side code to detect duplicate messages with the following MySQL query:
SELECT count(content) as msgs_sent
FROM messages
WHERE sender_id = '.$sender_id.'
GROUP BY content having count(content) > 10
The query works well, but now they're getting around it by changing a few characters in their messages. Is there a way to detect this with MySQL, or do I need to look at each grouping returned from MySQL and then use PHP to determine the percentage of similarity?
Any thoughts or suggestions?
Comments (1)
Fulltext Match
You could look at implementing something similar to the MATCH ... AGAINST example in the MySQL full-text search documentation, adapted to your messages table.
Note that to use these functions, your content column would need a FULLTEXT index.
What is score in that example? It is a relevance value, computed through the process described on the documentation page.
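The query from the original answer is not reproduced here, so the following is only a minimal sketch of what a MATCH ... AGAINST check might look like for the messages table in the question. The id column, the index name ft_content, and the `?` prepared-statement placeholders are assumptions made for illustration, not the answerer's code.

```sql
-- Assumption: messages has columns (id, sender_id, content) and uses
-- InnoDB (MySQL 5.6 or later) or MyISAM, which support FULLTEXT indexes.
-- The index name ft_content is hypothetical.
ALTER TABLE messages ADD FULLTEXT INDEX ft_content (content);

-- Score one sender's messages against the text of a suspect message.
-- Placeholders, in order: suspect text, sender id, suspect text again.
-- The search string must be constant for the whole query, so it is
-- passed in as a parameter and cannot be another row's column value.
SELECT id,
       MATCH (content) AGAINST (?) AS score
FROM messages
WHERE sender_id = ?
  AND MATCH (content) AGAINST (?)
ORDER BY score DESC;
```

In natural language mode the score is a non-negative floating-point relevance value, where zero means no similarity. It is not a percentage, so any cut-off for "near duplicate" would have to be tuned against real messages. A PHP pass over the top-scoring rows may still be needed to confirm true near-duplicates.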