How do I prevent page view counts from being gamed?
Say I have a site with pages. Pages are ranked based on the number of times they have been viewed. It is good for a page to be highly ranked because it will make it show up higher in my search results. Hence, the author of a page may try to game the system to increase that particular page's views.
So how do you prevent that while still keeping a quasi-accurate count?
I have come up with the following "scheme":
A user can only affect the page view once per session. This is what I would normally expect. If a user returns to the site later and views the page again, it should count as another page view.
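A minimal sketch of the once-per-session rule, assuming the session is available as a dict-like object and the counts live in a plain dict. The function name `count_view_once_per_session` and the `viewed_pages` key are hypothetical, not taken from any framework:

```python
def count_view_once_per_session(session: dict, counts: dict, page_id: str) -> bool:
    """Increment counts[page_id] only the first time this session sees the page.

    `session` stands in for whatever per-session storage the site uses
    (an assumption); `counts` stands in for the persistent view counters.
    Returns True if the view was counted.
    """
    seen = session.setdefault("viewed_pages", set())
    if page_id in seen:
        return False  # already counted during this session
    seen.add(page_id)
    counts[page_id] = counts.get(page_id, 0) + 1
    return True
```

A new session starts with an empty `viewed_pages` set, so a returning visitor is counted again, which is also why a script that discards its cookies can be counted on every request.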
The problem is that this makes the page view increment vulnerable to a script that clears its cookies before each request. The easiest solution would be to save the IP address and only allow the same IP address to increment the page count once. This, however, has several major drawbacks. First, it would potentially take up a lot of storage. Second, it would prevent users on big LANs, who often share one public IP address, from incrementing the page count. Lastly, a user could not revisit a page and increment the view count more than once from the same IP. I can live with that, but would rather live without it.
The best method I can come up with off the top of my head would be to save the last X IP addresses and not let a request from any of them affect the page view count. This would effectively stop any (simple) script from raising the page view count. Furthermore, it would probably be a good idea to delay the display of the actual view count (basically keeping two counts and a datetime field recording when the "display" count was last updated from the "actual" count, something I believe is done on the SE sites).
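The idea of a "last X IP addresses" window combined with a delayed display count could be sketched roughly as follows. The class name `PageViewCounter`, its field names, and the default values are all assumptions made for illustration, and timestamps are passed in as plain seconds to keep the example self-contained:

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PageViewCounter:
    """Hypothetical per-page counter; names and defaults are illustrative."""
    window: int = 100                # the "X" in "last X IP addresses"
    refresh_seconds: float = 3600.0  # how often the displayed count catches up
    actual: int = 0                  # real running total
    displayed: int = 0               # what visitors get to see
    last_refresh: float = 0.0        # time the displayed count was last synced
    recent_ips: deque = field(default_factory=deque)

    def record_view(self, ip: str) -> bool:
        """Count the view unless `ip` is among the last `window` counted IPs."""
        if ip in self.recent_ips:
            return False
        self.recent_ips.append(ip)
        if len(self.recent_ips) > self.window:
            self.recent_ips.popleft()  # forget the oldest IP
        self.actual += 1
        return True

    def display_count(self, now: float) -> int:
        """Return the delayed count, syncing it once the refresh interval passes."""
        if now - self.last_refresh >= self.refresh_seconds:
            self.displayed = self.actual
            self.last_refresh = now
        return self.displayed
```

Storage stays bounded at X addresses per page, and an IP can be counted again once X other addresses have pushed it out of the window. A script rotating through more than X addresses would still get through, so this only stops the simple cases.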
This is not a perfect solution, so I would be happy to hear your suggestions and/or comments.
Comments (2)
Don't prevent: monitor and handle.
I would use a very different approach. Let the page views stay the same, but have reporting in place that looks for view-gaming. If a page gets gamed, you can find out who is responsible, give them a warning and a page-view penalty. If it continues, ban them.
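One possible shape for such a report, as a sketch only: it assumes a view log of `(page_id, ip)` pairs is kept, and flags pages where a single IP accounts for an unusually large share of views. The function name `suspicious_pages` and the thresholds `min_views` and `max_share` are hypothetical and would need tuning against real traffic:

```python
from collections import Counter, defaultdict


def suspicious_pages(view_log, min_views=20, max_share=0.5):
    """Flag pages whose views are dominated by one IP address.

    view_log: iterable of (page_id, ip) pairs (an assumed log format).
    Returns {page_id: (top_ip, views_from_top_ip, total_views)}.
    """
    per_page = defaultdict(Counter)
    for page_id, ip in view_log:
        per_page[page_id][ip] += 1

    flagged = {}
    for page_id, ips in per_page.items():
        total = sum(ips.values())
        top_ip, top_count = ips.most_common(1)[0]
        # Skip low-traffic pages, where one visitor can dominate by chance.
        if total >= min_views and top_count / total > max_share:
            flagged[page_id] = (top_ip, top_count, total)
    return flagged
```

The output is meant for a human to review before issuing a warning or penalty, since shared IPs (for example a large office) can produce the same pattern legitimately.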
I think that you should consider the reported characteristics of the browser as well. Browser fingerprinting has been done before and is well publicized. You can then figure out some pretty advanced heuristics on determining whether the same user is trying to game you. But don't publicize that you're using browser fingerprinting of course. Also, it won't stop incognito mode, but I'm just trying to give you one more avenue of thought to follow, in addition to your current IP oriented strategies.
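As a very rough illustration of the fingerprinting idea, the sketch below hashes a handful of request headers into one identifier. The choice of headers, the name `browser_fingerprint`, and the plain-dict representation of headers are assumptions; real fingerprinting typically combines many more signals, and two different users with identical browser setups would collide here:

```python
import hashlib

# Headers used for the fingerprint (an illustrative selection, not a standard).
FINGERPRINT_HEADERS = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")


def browser_fingerprint(headers: dict) -> str:
    """Return a SHA-256 hex digest over selected request headers.

    Header names are matched case-insensitively; missing headers count
    as empty strings.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    parts = [normalized.get(name.lower(), "") for name in FINGERPRINT_HEADERS]
    # Join with a separator that does not normally occur in header values.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

The resulting digest could be stored next to (or instead of) the IP address when deciding whether a view is a repeat, so that clearing cookies alone does not reset the check.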