BOT/Spider Trap Ideas

I have a client whose domain seems to be getting hit pretty hard by what appears to be a DDoS. In the logs it's normal-looking user agents with random IPs, but they're flipping through pages too fast to be human. They also don't appear to be requesting any images. I can't seem to find any pattern, and my suspicion is it's a fleet of Windows zombies.

The client has had issues in the past with SPAM attacks--even had to point MX at Postini to stop the 6.7 GB/day of junk server-side.

I want to set up a BOT trap in a directory disallowed by robots.txt... I've just never attempted anything like this before, and I'm hoping someone out there has some creative ideas for trapping BOTs!

EDIT: I already have plenty of ideas for catching one... it's what to do with one once it lands in the trap.
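Roughly this layout is what I mean (paths are placeholders): disallow a directory in robots.txt, then link to it invisibly, so only crawlers that ignore the rules ever arrive:

# robots.txt -- well-behaved crawlers never enter the trap directory
User-agent: *
Disallow: /admin/

<!-- on a normal page: a link no human sees or clicks -->
<a href="/admin/trap.php" style="display:none">&nbsp;</a>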

Comments (5)

燕归巢 2024-10-02 03:47:30

Well I must say, kinda disappointed--I was hoping for some creative ideas. I did find the ideal solution here: http://www.kloth.net/internet/bottrap.php

<html>
    <head><title> </title></head>
    <body>
    <p>There is nothing here to see. So what are you doing here?</p>
    <p><a href="http://your.domain.tld/">Go home.</a></p>
    <?php
      /* whitelist: end processing and exit */
      if (preg_match("/10\.22\.33\.44/", $_SERVER['REMOTE_ADDR'])) { exit; }
      if (preg_match("/Super Tool/", $_SERVER['HTTP_USER_AGENT'])) { exit; }
      /* end of whitelist */
      $badbot = 0;
      /* scan the blacklist.dat file for addresses of SPAM robots
         to prevent filling it up with duplicates */
      $filename = "../blacklist.dat";
      $fp = fopen($filename, "r") or die("Error opening file ... <br>\n");
      while ($line = fgets($fp, 255)) {
        $u = explode(" ", $line);
        $u0 = trim($u[0]);
        if ($u0 == "") { continue; } /* skip blank lines */
        /* preg_quote escapes the dots so 10.22.33.44 cannot match 10x22y33z44 */
        if (preg_match("/" . preg_quote($u0, "/") . "/", $_SERVER['REMOTE_ADDR'])) { $badbot++; }
      }
      fclose($fp);
      if ($badbot == 0) { /* we just saw a new bad bot not yet listed! */
        /* send a mail to hostmaster */
        $timestamp = time();
        $datum = date("Y-m-d (D) H:i:s", $timestamp);
        $from = "[email protected]";
        $to = "[email protected]";
        $subject = "domain-tld alert: bad robot";
        $msg  = "A bad robot hit {$_SERVER['REQUEST_URI']} $datum\n";
        $msg .= "address is {$_SERVER['REMOTE_ADDR']}, agent is {$_SERVER['HTTP_USER_AGENT']}\n";
        mail($to, $subject, $msg, "From: $from");
        /* append bad bot address data to blacklist log file: */
        $fp = fopen($filename, 'a+');
        fwrite($fp, "{$_SERVER['REMOTE_ADDR']} - - [$datum] \"{$_SERVER['REQUEST_METHOD']} {$_SERVER['REQUEST_URI']} {$_SERVER['SERVER_PROTOCOL']}\" {$_SERVER['HTTP_REFERER']} {$_SERVER['HTTP_USER_AGENT']}\n");
        fclose($fp);
      }
    ?>
    </body>
</html>

Then, to protect pages, put <?php include($_SERVER['DOCUMENT_ROOT'] . "/blacklist.php"); ?> on the first line of every page. blacklist.php contains:

<?php
    $badbot = 0;
    /* look for the visitor's IP address in the blacklist file */
    $filename = "../blacklist.dat";
    $fp = fopen($filename, "r") or die("Error opening file ... <br>\n");
    while ($line = fgets($fp, 255)) {
      $u = explode(" ", $line);
      $u0 = trim($u[0]);
      if ($u0 == "") { continue; } /* skip blank lines */
      /* preg_quote keeps the dots in the logged IP from acting as wildcards */
      if (preg_match("/" . preg_quote($u0, "/") . "/", $_SERVER['REMOTE_ADDR'])) { $badbot++; }
    }
    fclose($fp);
    if ($badbot > 0) { /* this is a bad bot, reject it */
      sleep(12); /* stall the bot before answering */
      print ("<html><head>\n");
      print ("<title>Site unavailable, sorry</title>\n");
      print ("</head><body>\n");
      print ("<center><h1>Welcome ...</h1></center>\n");
      print ("<p><center>Unfortunately, due to abuse, this site is temporarily not available ...</center></p>\n");
      print ("<p><center>If you feel this is in error, send a mail to the hostmaster at this site,<br>
             if you are an anti-social ill-behaving SPAM-bot, then just go away.</center></p>\n");
      print ("</body></html>\n");
      exit;
    }
?>

I plan to take Scott Chamberlain's advice and, to be safe, I plan to implement a CAPTCHA on the script. If the user answers correctly then it'll just die or redirect back to the site root. Just for fun I'm throwing the trap in a directory named /admin/ and of course adding Disallow: /admin/ to robots.txt.
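A rough sketch of that CAPTCHA escape hatch at the top of the trap page -- captcha_is_valid() is a hypothetical stand-in for whatever CAPTCHA library you use:

<?php
  /* sketch only: captcha_is_valid() is a placeholder for a real CAPTCHA check */
  if (isset($_POST['captcha']) && captcha_is_valid($_POST['captcha'])) {
    /* a human solved it: send them back to the site root, skip the blacklisting below */
    header("Location: http://your.domain.tld/");
    exit;
  }
?>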

EDIT: In addition, I am redirecting bots that ignore the rules to this page: http://www.seastory.us/bot_this.htm

梦亿 2024-10-02 03:47:30

You could first take a look at where the IPs are coming from. My guess is that they are all coming from one country, like China or Nigeria, in which case you could set up something in .htaccess to disallow all IPs from those countries. As for creating a trap for bots, I haven't the slightest idea.
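If the addresses really do cluster by country, an Apache 2.2-style .htaccess sketch might look like this -- the CIDR ranges below are placeholders, you'd substitute the real per-country allocations from a GeoIP or registry list:

# .htaccess sketch -- ranges below are placeholders, not real country allocations
Order Allow,Deny
Allow from all
Deny from 1.2.3.0/24
Deny from 5.6.0.0/16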

ま柒月 2024-10-02 03:47:29

What you can do is get another box (a kind of sacrificial lamb) that's not on the same pipe as your main host, then have it host a page which redirects to itself (but with a randomized page name in the URL). This could get the bot stuck in an infinite loop, tying up the CPU and bandwidth on your sacrificial lamb but not on your main box.
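A minimal PHP sketch of such a self-redirecting page, assuming a rewrite rule on the sacrificial box maps every random name back to this one script:

<?php
  /* tarpit sketch: every hit 302s to a fresh random URL served by this same script */
  /* assumes something like "RewriteRule ^loop/ /loop.php [L]" in the host config */
  sleep(5); /* slow each round trip to burn the bot's time */
  $next = "/loop/" . md5(uniqid(mt_rand(), true)) . ".html";
  header("Location: $next", true, 302);
  exit;
?>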

百思不得你姐 2024-10-02 03:47:29

I tend to think this is a problem better solved with network security than with coding, but I see the logic in your approach/question.

There are a number of questions and discussions about this on Server Fault which may be worth investigating.

https://serverfault.com/search?q=block+bots

沦落红尘 2024-10-02 03:47:28

You can set up a PHP script whose URL is explicitly forbidden by robots.txt. In that script, you can pull the source IP of the suspected bot hitting you (via $_SERVER['REMOTE_ADDR']), and then add that IP to a database blacklist table.

Then, in your main app, you can check the source IP, do a lookup for that IP in your blacklist table, and if you find it, throw a 403 page instead. (Perhaps with a message like, "We've detected abuse coming from your IP, if you feel this is in error, contact us at ...")
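A sketch of both pieces using PDO with SQLite -- the database path, table name, and schema here are assumptions, not anything from the original answer:

<?php
  /* trap.php -- the URL that robots.txt forbids; records the visitor's IP */
  $db = new PDO("sqlite:/var/data/blacklist.db"); /* placeholder path */
  $db->exec("CREATE TABLE IF NOT EXISTS blacklist (ip TEXT PRIMARY KEY, seen INTEGER)");
  $stmt = $db->prepare("INSERT OR IGNORE INTO blacklist (ip, seen) VALUES (?, ?)");
  $stmt->execute(array($_SERVER['REMOTE_ADDR'], time()));

  /* in the main app, near the top of every page: check and reject */
  $stmt = $db->prepare("SELECT 1 FROM blacklist WHERE ip = ?");
  $stmt->execute(array($_SERVER['REMOTE_ADDR']));
  if ($stmt->fetchColumn()) {
    header("HTTP/1.0 403 Forbidden");
    exit("We've detected abuse coming from your IP; if you feel this is in error, contact us.");
  }
?>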

On the upside, you get automatic blacklisting of bad bots. On the downside, it's not terribly efficient, and it can be dangerous. (One person innocently checking that page out of curiosity can result in the ban of a large swath of users.)

Edit: Alternatively (or additionally, I suppose) you can fairly simply add a GeoIP check to your app, and reject hits based on country of origin.
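For the GeoIP variant, a sketch assuming the PECL geoip extension is installed (the blocked-country codes are placeholders):

<?php
  /* sketch: reject by country of origin; requires the PECL geoip extension */
  $country = geoip_country_code_by_name($_SERVER['REMOTE_ADDR']);
  $blocked = array("XX", "YY"); /* placeholder ISO country codes */
  if ($country !== false && in_array($country, $blocked)) {
    header("HTTP/1.0 403 Forbidden");
    exit;
  }
?>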
