Hunting cheaters in a voting competition

Posted on 2024-08-23 10:02:15

We are currently running a competition, and it is going very well. Unfortunately, the cheaters are back in business, running scripts that automatically vote for their entries. We have already spotted some cheaters by looking through the database entries by hand: for example, 5-star ratings from the same browser exactly every 70 minutes. As the user base grows, it gets harder and harder to identify them.

What we have done so far:

  1. We store the IP address and the browser and block that combination for one hour. Cookies won't help against these guys.
  2. We also use a CAPTCHA, which has been broken.

Does anyone know how we could find patterns in our database with a PHP script, or how we could block the cheaters more efficiently?

Any help would be greatly appreciated...

Comments (19)

瀞厅☆埖开 2024-08-30 10:02:15

Direct feedback elimination

This is more of a general strategy that can be combined with many of the other methods. Don't let the spammer know whether they succeeded.

You can hide the current results altogether, show only percentages without the absolute number of votes, or delay the display of the votes.

  • Pro: good against all methods
  • Con: if the fraud is massive, percentage display and delay won't be effective

Vote flagging

Also a general strategy. If you have some reason to assume that a vote comes from a spammer, still accept and record the vote, but mark it as invalid, and delete the invalid votes at the end.

  • Pro: good against all detectable spam attacks
  • Con: skews the vote, harder to set up, false positives
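A minimal sketch of vote flagging, assuming votes are held as simple records; the `Vote` class, its `suspicious` field, and the `tally` function are hypothetical names, not part of any particular framework. Flagged votes stay in storage, so the voter sees no difference, but they are left out of the final count.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    entry_id: int
    ip: str
    suspicious: bool = False  # set by whatever detection rule you trust


def tally(votes):
    """Count only votes that were never flagged; flagged votes stay stored."""
    totals = {}
    for vote in votes:
        if vote.suspicious:
            continue
        totals[vote.entry_id] = totals.get(vote.entry_id, 0) + 1
    return totals


votes = [
    Vote(entry_id=1, ip="203.0.113.5"),
    Vote(entry_id=1, ip="203.0.113.5", suspicious=True),
    Vote(entry_id=2, ip="198.51.100.7"),
]
print(tally(votes))  # {1: 1, 2: 1}
```

Because the flag is only applied at counting time, a false positive can still be cleared by hand before the final tally.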

Captcha

Use a CAPTCHA. If your CAPTCHA is broken, use a better one.

  • Pro: good against all automated scripts.
  • Con: useless against Pharyngulation (crowds of real people sent to the poll by a popular blog)

IP checking

Limit the number of votes an IP address can cast in a timespan.

  • Pro: Good against random dudes who constantly hit F5 in their browser
  • Pro: Easy to implement
  • Con: Useless against Pharyngulation and elaborate scripts which use proxy servers.
  • Con: An IP address sometimes maps to many different users
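As a rough illustration of per-IP rate limiting, here is a sliding-window limiter kept in memory. The class name `IpRateLimiter` and the limits used below are assumptions for the example; a real site would keep the timestamps in its database or a shared cache instead.

```python
from collections import defaultdict, deque


class IpRateLimiter:
    """Allow at most max_votes per IP address within window_seconds."""

    def __init__(self, max_votes, window_seconds):
        self.max_votes = max_votes
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # ip -> timestamps of accepted votes

    def allow(self, ip, now):
        timestamps = self._history[ip]
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_votes:
            return False
        timestamps.append(now)
        return True


limiter = IpRateLimiter(max_votes=3, window_seconds=3600)
results = [limiter.allow("203.0.113.5", now=t) for t in (0, 10, 20, 30, 3601)]
print(results)  # [True, True, True, False, True]
```

The fourth vote is rejected because three votes already fall inside the hour; the fifth is accepted once the oldest vote has aged out.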

Referrer checking

If you assume that one user maps to one IP address, you can limit the number of votes from that IP address. However, this assumption usually only holds true for private households.

  • Pro: Easy to implement
  • Pro: Good against simple pharyngulation to some extent
  • Con: Very easy to circumvent by automated scripts
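Referrer checking is usually understood as inspecting the HTTP Referer header of the vote request. The sketch below assumes the request headers are available as a dictionary and that the voting form is served from the hypothetical host `www.example.com`; since a script can send any Referer it likes, this only filters out the simplest cases.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"www.example.com"}  # hypothetical host serving the voting form


def referer_is_allowed(headers, allowed_hosts=ALLOWED_HOSTS):
    """Return True only if the Referer header points at one of our own hosts."""
    lowered = {name.lower(): value for name, value in headers.items()}
    host = urlsplit(lowered.get("referer", "")).hostname
    return host is not None and host in allowed_hosts


print(referer_is_allowed({"Referer": "https://www.example.com/poll"}))
print(referer_is_allowed({"Referer": "https://blog.example.org/post"}))
```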

Email Confirmation

Use email confirmation and only allow one vote per email address. Check your database manually to see if they are using throwaway emails.

Note that you can add +foo to the username in an email address. username@example.com and username+foo@example.com will both deliver the mail to the same account, so remember that when checking whether somebody has already voted.

  • Pro: good against simple spam scripts
  • Con: harder to implement
  • Con: Some users won't like it
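The +foo trick can be neutralized by normalizing addresses before checking for duplicates. This is a sketch under two assumptions: that the mail provider ignores everything after the first `+` in the local part, and that treating addresses as case-insensitive is acceptable. The function names are made up for the example.

```python
def normalize_email(address):
    """Lower-case the address and strip any +suffix from the local part."""
    local, at, domain = address.strip().lower().rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    local = local.split("+", 1)[0]
    return f"{local}@{domain}"


seen = set()


def register_vote(address):
    """Return True for the first vote of a mailbox, False for repeats."""
    key = normalize_email(address)
    if key in seen:
        return False
    seen.add(key)
    return True


print(register_vote("username@example.com"))      # True
print(register_vote("Username+foo@example.com"))  # False
```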

HTML Form Randomization

Randomize the order of choices. This might take a while for them to find out.

  • Pro: nice to have anyways
  • Con: once detected, very easy to circumvent

HTTPS

One method of vote faking is to capture the HTTP request from a valid browser like Firefox and mimic it with a script. This is not as easy when you use encryption.

  • Pro: nice to have anyway
  • Pro: good against very simple scripts
  • Con: more difficult to set up

Proxy checking

If the spammer votes via proxy, you can check for the X-Forwarded-For header.

  • Pro: good against more advanced scripts that use proxies
  • Con: some legitimate users can be affected
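A small sketch of the proxy check, assuming request headers arrive as a dictionary; `proxy_chain` is a hypothetical helper. Keep in mind that the header is set by the client side and can be forged or omitted, and that corporate proxies add it for perfectly legitimate users, so it is better used as a flagging signal than as a hard block.

```python
def proxy_chain(headers):
    """Return the addresses listed in X-Forwarded-For, or [] if absent."""
    for name, value in headers.items():
        if name.lower() == "x-forwarded-for":
            return [part.strip() for part in value.split(",") if part.strip()]
    return []


request_headers = {"X-Forwarded-For": "203.0.113.5, 198.51.100.7"}
chain = proxy_chain(request_headers)
if chain:
    # The first entry is the address the proxy claims the request came from.
    print("vote came through a proxy, claimed origin:", chain[0])
```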

Cache checking

Try to see if the client loads all the uncached resources.
Many spambots don't do this. I have never tried this myself; I just know that voting sites don't usually check for it.

An example would be embedding <img src="a.gif" /> in your HTML, with a.gif being some 1x1 pixel image. Then you have to send the HTTP header Cache-Control "no-cache, must-revalidate" with the response to GET /a.gif. You can set HTTP headers in Apache through your .htaccess file. (Thanks, Jacco.)

  • Pro: uncommon method as far as I know
  • Con: slightly harder to set up
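One way to picture the server side of this check: remember which sessions actually requested the pixel, and treat votes from sessions that never did as suspicious. The `PixelTracker` class and the session ids are hypothetical; the Cache-Control value is the one named above.

```python
class PixelTracker:
    """Remember which sessions fetched the uncacheable 1x1 image."""

    def __init__(self):
        self._loaded = set()

    def pixel_response_headers(self, session_id):
        # Called when GET /a.gif arrives; the headers forbid caching so that
        # a real browser has to request the image on every page view.
        self._loaded.add(session_id)
        return {
            "Content-Type": "image/gif",
            "Cache-Control": "no-cache, must-revalidate",
        }

    def fetched_pixel(self, session_id):
        return session_id in self._loaded


tracker = PixelTracker()
tracker.pixel_response_headers("session-browser")
print(tracker.fetched_pixel("session-browser"))  # True
print(tracker.fetched_pixel("session-script"))   # False
```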

[Edit 2010-09-22]

Evercookie

  • A so-called evercookie can be useful to track browser-based spammers

逆光飞翔i 2024-08-30 10:02:15

Have you tried browser fingerprinting?
Check out this open-source project from the EFF:
https://panopticlick.eff.org/
It could be used to narrow a visitor down to a group of roughly 500-1500 similar people in the world (!).

忘东忘西忘不掉你 2024-08-30 10:02:15

You could add a CAPTCHA to the voting form. Requiring email confirmation would also be useful.

谎言 2024-08-30 10:02:15

If you're really worried about it, then you have to do something like email verification, which might be sufficient to block most cheaters.

It also depends on whether multiple people behind a NAT are likely to want to vote for the same option (e.g. favourite school).

Any scheme you create can be gamed.

EDIT: As everyone else has suggested, you can use a CAPTCHA such as reCAPTCHA to block automated bots and make humans less likely to vote repeatedly, at the cost of making humans less likely to vote at all.

并安 2024-08-30 10:02:15

The Vote to Promote pattern (you may be aware of it) has a section on how to mitigate gaming, but it is tricky to avoid altogether. Given your actions to date, I would consider using weighting. For example, pick a reasonable level of voting over a time period, say 10 votes per item per hour (just an example, not a guide), and for surplus votes weight the next 10 at 90% (i.e. only count 9), the next 10 at 80%, and so on. This is Yahoo's advice on gaming within this pattern:

Community voting systems do present a number of challenges. Particularly the possibility that members of the community may try to game the system, out of any number of motivations:

  • malice - perhaps against another member of the community and that member's contributions.
  • gain - to realize some reward, monetary or otherwise, from influencing the placement of certain items in the pool.
  • or an overarching agenda - always promoting certain viewpoints or political statements, with little regard for the actual quality of the content being voted for.

There are a number of ways to attempt to safeguard against this type of abuse. Though nothing can stop gaming altogether. Here are some ways to minimize or hinder abusers in their efforts:

  • Vote for things, not people. In keeping with Yahoo's general strategy, don't offer users the ability to directly vote on another user: their looks, their likeability, intelligence, or anything else. It's OK for the community to vote on a person's contributions, but not on the quality of their character.
  • Consider rate-limiting of votes.
      o Only allow the user a certain number of votes within a given time-period.
      o Limit the number of times (or the rate at which) a user votes down a particular user's content. (To prevent ad-hominem attacks.)
  • Weigh other factors besides just the number of votes. Digg, for instance, does not calculate their Digg-score solely on the number of votes a submission receives. Their algorithm also considers: "story source (is it a blog repost, or the original story), user history, traffic levels of the category the story falls under, and user reports." They update this algorithm frequently. Consider keeping the exact algorithm a secret from the community, or only discuss the factored inputs in general terms.
  • If relationship information is available consider weighting user votes accordingly. Perhaps prohibit users with formal relationships from voting for each other's submissions.

While this is currently a popular pattern on the Web, it is important to consider the contexts in which we use it. Very active and popular communities (Digg is an excellent example) that enable community-voting can also engender a certain negativity of spirit (mean comments, opinionated cliques, group attacks on 'outlier' viewpoints).
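The weighting idea from the start of this answer (first 10 votes in the hour at full value, the next 10 at 90%, the next 10 at 80%, and so on) could be sketched as follows. The function name and parameters are illustrative, and the numbers are the example values from the text, not a recommendation. Integer percentages are used to avoid floating-point drift.

```python
def weighted_votes(raw_votes, block_size=10, step_percent=10):
    """Discount surplus votes: each further block counts step_percent less."""
    total_percent = 0
    remaining = raw_votes
    weight_percent = 100
    while remaining > 0 and weight_percent > 0:
        counted = min(block_size, remaining)
        total_percent += counted * weight_percent
        remaining -= counted
        weight_percent -= step_percent
    return total_percent / 100


for raw in (10, 20, 25, 1000):
    print(raw, "->", weighted_votes(raw))
```

With these values 20 raw votes count as 19, 25 count as 23, and no matter how many votes arrive within the hour the item never gains more than 55.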

水波映月 2024-08-30 10:02:15

Check out Asirra: http://research.microsoft.com/en-us/um/redmond/projects/asirra/
It's still in beta, but it's pretty cool.

鹤舞 2024-08-30 10:02:15

To prevent bots from voting, you can use a CAPTCHA.

迷雾森÷林ヴ 2024-08-30 10:02:15

The only thing that comes to mind is using a CAPTCHA: either an elaborate one with pictures and noise, like the reCAPTCHA service, or a very simple and unobtrusive one like "What is seven plus three?" or (if you're located in the US) "What is the last name of our President?", simple common-sense questions everybody can answer. If you change them often enough, this could even be more effective than a classic image-based CAPTCHA.

乱世争霸 2024-08-30 10:02:15

CAPTCHAs aren't a silver bullet: the user could have their script display the CAPTCHA to them and solve it manually, still managing at least several votes per minute.

You need to use them in combination with other techniques mentioned here.

孤独患者 2024-08-30 10:02:15

You could add a honeypot field like in Django. Most likely, this will not protect you from cheaters who deliberately want to manipulate your competition, but at least you will have fewer 'drive-by' spammers to take care of on top of that.
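The honeypot idea in its simplest form: add a form field that is hidden from humans with CSS and reject any submission that fills it in. The field name `website` and the helper below are assumptions for the sketch, not a specific framework's API.

```python
HONEYPOT_FIELD = "website"  # hypothetical field, hidden from humans via CSS


def looks_like_bot(form_data):
    """A human never sees the field, so any value in it suggests a script."""
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())


print(looks_like_bot({"entry_id": "42", "website": ""}))                    # False
print(looks_like_bot({"entry_id": "42", "website": "http://spam.example"}))  # True
```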

他夏了夏天 2024-08-30 10:02:15

Sorry for the double post, but I wasn't allowed to post two URLs in the same post...

If you're looking at building your own tracking, maybe this link might provide some inspiration: https://panopticlick.eff.org/
Turns out that a lot of browsers can be uniquely identified, even without any form of tracking cookies. I'm guessing a vote-bot might give a very specific fingerprint?

夢归不見 2024-08-30 10:02:15

So if anyone ever wants to run a competition where people can win something and wants to use a community-driven rating system... here I share some experiences:

The bad:
1) First, it can't be made 100% secure
2) Reaching a mass of users large enough to filter out all the nonsense ratings is very hard
3) Forget about star ratings in that case... there are always either 5 stars or 1 star

The good:
1) Don't give them any orientation about where they stand... We replaced the "Order by place" view with a random presentation of the TOP 100 (only the top 30 will win a prize)... This really helped, because a lot of users lost interest as soon as they couldn't see where they stood.

2) Don't allow voting patterns like 1x5_Stars 40x1_Star... Only accept users who vote in a fair way...

3) Most of them act a little bit stupid... You'll see them in your logs and can trace who votes fairly and who doesn't... Search for patterns...
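The "1x5_Stars 40x1_Star" pattern from point 2 can be searched for mechanically. This sketch assumes ratings are available as (user_id, entry_id, stars) tuples; the function name and the thresholds are arbitrary illustration values that would need tuning against real data.

```python
from collections import Counter, defaultdict


def lopsided_voters(ratings, min_votes=10, low_share=0.9):
    """Flag users who gave at least one 5-star rating and 1 star to almost all else."""
    per_user = defaultdict(list)
    for user_id, _entry_id, stars in ratings:
        per_user[user_id].append(stars)

    flagged = []
    for user_id, stars_given in per_user.items():
        if len(stars_given) < min_votes:
            continue  # too few votes to judge
        counts = Counter(stars_given)
        if counts[5] >= 1 and counts[1] / len(stars_given) >= low_share:
            flagged.append(user_id)
    return flagged


ratings = [("cheater", 1, 5)] + [("cheater", entry, 1) for entry in range(2, 42)]
ratings += [("fair", entry, stars) for entry, stars in enumerate([3, 4, 5, 2, 4, 3, 5, 1, 4, 3], start=1)]
print(lopsided_voters(ratings))  # ['cheater']
```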

GOOD LUCK ;-)

野却迷人 2024-08-30 10:02:15

A CAPTCHA is always good, though it might be "disturbing" for some users.

reCAPTCHA is a fairly widely used service.

渔村楼浪 2024-08-30 10:02:15

How about only allowing users who have logged in with OpenID and solved a reCAPTCHA to submit a vote, and monitoring the list of submitters that share the same IP address?

深府石板幽径 2024-08-30 10:02:15

We use a combination of CAPTCHA and email. The user receives a link with a GUID by mail.
The GUID must be unique for each user who tries to vote.
www.votesite.com/vote.aspx?guid=.....
Following this link confirms the vote; until then it stays unconfirmed. In the database we check that the combination of email address and GUID is unique.
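A rough sketch of such a confirmation flow, using in-memory dictionaries where the answer uses a database. The class and method names are invented for the example, and sending the actual mail is left out.

```python
import uuid


class VoteConfirmations:
    """Issue one GUID per email address and confirm each GUID at most once."""

    def __init__(self):
        self._pending = {}       # token -> email awaiting confirmation
        self._confirmed = set()  # emails whose vote has been confirmed

    def request_vote(self, email):
        """Return a fresh token to mail out, or None if the email was already used."""
        email = email.strip().lower()
        if email in self._confirmed or email in self._pending.values():
            return None
        token = str(uuid.uuid4())
        self._pending[token] = email
        return token

    def confirm(self, token):
        """Called when the link is opened; each token works only once."""
        email = self._pending.pop(token, None)
        if email is None:
            return False
        self._confirmed.add(email)
        return True


confirmations = VoteConfirmations()
token = confirmations.request_vote("voter@example.com")
print(confirmations.confirm(token))  # True
print(confirmations.confirm(token))  # False
```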

流星番茄 2024-08-30 10:02:15

I use a combination of CAPTCHA, IP verification and LSO (Flash Local Shared Objects, hard to find and delete for common people).

偷得浮生 2024-08-30 10:02:15

1. Use reCAPTCHA.

2. Yes, randomize your voting options, but not like this:
      -> from vote_id_1 to asdsasd_1, grdsgsdg_2,
      Instead, use session variables to set a mask from vote_id_1 to asgjdas87th2ad in the vote form.
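The session mask could look roughly like this: for every session, each real option id is replaced by a random token that reveals nothing about the id, and the mapping is kept on the server. `build_mask` and `resolve` are hypothetical helpers; where the mask is stored (session variable, cache) is up to the application.

```python
import secrets


def build_mask(option_ids):
    """Map a fresh random token to each real option id for one session."""
    return {secrets.token_hex(8): option_id for option_id in option_ids}


def resolve(session_mask, submitted_token):
    """Translate a submitted token back to the option id, or None if unknown."""
    return session_mask.get(submitted_token)


# Store the mask in the user's session and render the tokens as form values.
session_mask = build_mask(["vote_id_1", "vote_id_2"])
for token, option_id in session_mask.items():
    print(token, "->", option_id)
```

A script that replays a form value captured in another session will submit a token that is not in the current mask, so `resolve` returns None and the vote can be dropped or flagged.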

静若繁花 2024-08-30 10:02:15

What about some post hoc stochastic analysis, such as time series analysis, looking for periodicity in the events of a particular (ip, browser, vote) combination? You could then assign each such group of events a probability that it belongs to one person, and either discard all groups beyond some probability level or use some kind of weighting to lower their weight according to that probability.

Have a look at R; it contains A LOT of useful analysis packages.
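As a much simpler stand-in for proper time series analysis, one can group votes by (ip, browser, entry) and check how regular the gaps between them are. The sketch below flags groups whose gaps barely vary (low coefficient of variation); the function name, the input format, and both thresholds are assumptions, and it yields a yes/no flag where the answer suggests a probability.

```python
from collections import defaultdict
from statistics import mean, pstdev


def periodic_groups(votes, min_votes=5, max_cv=0.05):
    """votes: iterable of (ip, browser, entry_id, unix_timestamp) tuples.

    Returns {group_key: average_gap_seconds} for suspiciously regular groups.
    """
    by_group = defaultdict(list)
    for ip, browser, entry_id, timestamp in votes:
        by_group[(ip, browser, entry_id)].append(timestamp)

    suspicious = {}
    for key, stamps in by_group.items():
        if len(stamps) < min_votes:
            continue  # not enough data points to call it a pattern
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        average_gap = mean(gaps)
        if average_gap == 0:
            continue  # identical timestamps; handle with a separate rule
        if pstdev(gaps) / average_gap <= max_cv:
            suspicious[key] = average_gap
    return suspicious


# A script voting exactly every 70 minutes versus an irregular human.
bot = [("203.0.113.5", "Firefox", 1, i * 4200) for i in range(6)]
human = [("198.51.100.7", "Chrome", 2, t) for t in (100, 5000, 5300, 40000, 90000)]
print(periodic_groups(bot + human))
```

Only the group voting every 4200 seconds is reported; a cheater who adds random jitter to the interval would need a looser threshold or a real periodicity test.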

情未る 2024-08-30 10:02:15

Check the domain details of the email they are using. I had the same problem and found that all of them were registered to the same registrant. I wrote it up here: http://tincan.co.uk/659/news/competition-spammers.html

Now, I filter on the DNS information for the email used in the registration.
