Sending the HTML code to the client first (before any other code)
I would like this:
<NOSCRIPT>
<META HTTP-EQUIV="refresh" CONTENT="1; URL=js/nonJs.htm">
</NOSCRIPT>
to be the first thing the server sends to the browser (before all other info)
Where should I place the code?
Thanks.
I need to use this script to block bots (the site is on a dedicated server, but so much traffic comes from bots that it makes the site unusable).
After implementing this code on my site it is almost twice as fast! It is still quite slow, though. (The loss of visitors is minimal; hardly anyone browses with JS disabled.) The way it works now, the site loads and then the no-JS script kicks in, but I would prefer the page not to load at all before the JS check has run (maybe this is possible on the server side?).
Comments (1)
You can keep that piece of HTML in your head section, as it doesn't really matter in what order it is sent.
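For example, here is a minimal sketch of the placement (the js/nonJs.htm URL comes from your snippet; the title and body are assumed filler):

<!DOCTYPE html>
<html>
<head>
    <!-- Anywhere in <head> works; the browser honours the refresh as soon as it is parsed -->
    <noscript>
        <meta http-equiv="refresh" content="1; URL=js/nonJs.htm">
    </noscript>
    <title>Example page</title>
</head>
<body>
    ...
</body>
</html>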
If the bot is half decent it will be able to run JS; furthermore, bots will probably ignore the refresh command anyway.
If all your traffic is coming from a particular bot, you can record its user-agent string and block it altogether, either server-side or with .htaccess if you're on Apache: http://www.thesitewizard.com/apache/block-bots-with-htaccess.shtml
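A rough sketch of the .htaccess approach, assuming mod_rewrite is enabled ("BadBot" is a placeholder; substitute the user-agent string you see in your access logs):

# Deny any request whose User-Agent contains "BadBot" (case-insensitive)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]

The [F] flag makes Apache answer with 403 Forbidden before your page is ever generated, which also addresses your wish to stop the page loading for bots.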
If it's a large bot like Google or Bing, it will follow the rules, and you can create a robots.txt: http://www.robotstxt.org/robotstxt.html - this will let robots access only the pages you want them to.
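For instance, a minimal robots.txt at the site root (the paths are placeholders for whatever you want to keep crawlers out of):

# Compliant crawlers such as Googlebot and Bingbot will skip these paths
User-agent: *
Disallow: /js/
Disallow: /private/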