Prevent search engines from indexing a development site
I think one of my sites recently got delisted from Google because Google found and started indexing my dev site, which is basically a replica of my main site (dev.site.com and site.com).
Is there a way to create a single robots.txt that prevents anything on dev.site.com from being indexed while leaving site.com fully indexed?
I know I could keep a separate robots.txt for each, but it would be easier to have one file that covers both. I work with a whole lot of sites that have dev sites, and I would like a simple workflow where I don't have to change the robots.txt every time I push a new version of a site to live.
Perhaps you could serve the robots.txt file dynamically, e.g. via PHP, choosing its contents based on the host name being requested.
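A minimal sketch of that idea, written in Python rather than PHP and assuming the host names from the question. The names `DEV_HOSTS`, `robots_txt_for_host` and `app` are illustrative and not part of the original answer. It returns a disallow-all file for the dev host and an allow-all file for everything else, so the same code could be deployed to both sites unchanged.

```python
# Hypothetical helper: pick the robots.txt body from the request's Host header.
DEV_HOSTS = {"dev.site.com"}  # assumption: host name taken from the question

BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # keep crawlers out of everything
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # empty Disallow allows everything


def robots_txt_for_host(host: str) -> str:
    """Return the robots.txt body to serve for the given Host header value."""
    # Drop an optional port and normalise case before comparing.
    hostname = host.split(":", 1)[0].strip().lower()
    return BLOCK_ALL if hostname in DEV_HOSTS else ALLOW_ALL


def app(environ, start_response):
    """Tiny WSGI app that serves the host-dependent robots.txt."""
    body = robots_txt_for_host(environ.get("HTTP_HOST", "")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The web server would need to route requests for /robots.txt to this handler on both hosts; how that is done depends on the setup and is left out here.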
Another approach is to add a line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
This is argued to be better than robots.txt because, if there is a link to your dev site, search engines will still report that link even when robots.txt keeps them from indexing the site itself. It is advocated here:
http://yoast.com/prevent-site-being-indexed/
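If the same files are deployed to both hosts, the header would have to be sent only for the dev host. One way to picture that is at the application level: a rough Python sketch, assuming a WSGI application and the host names from the question. `DEV_HOSTS` and `noindex_on_dev` are made-up names for illustration, not an existing API.

```python
# Hypothetical WSGI middleware: add X-Robots-Tag only when serving the dev host.
DEV_HOSTS = {"dev.site.com"}  # assumption: host name taken from the question


def is_dev_host(host: str) -> bool:
    """True if the Host header value (optionally with a port) is a dev host."""
    return host.split(":", 1)[0].strip().lower() in DEV_HOSTS


def noindex_on_dev(app):
    """Wrap a WSGI app so responses served on dev hosts carry X-Robots-Tag."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            if is_dev_host(environ.get("HTTP_HOST", "")):
                # Copy the list so the wrapped app's own headers stay untouched.
                headers = list(headers) + [("X-Robots-Tag", "noindex, nofollow")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped
```

Note that a crawler only sees this header if it is allowed to fetch the page, so it does not combine with a robots.txt that blocks the same URLs.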
It's part of the standard that each subdomain must have its own robots.txt (that is, when the site is accessed as dev.site.com; you wouldn't need a separate one for site.com/dev).
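To make the per-host rule concrete, here is a small Python sketch of how the robots.txt location follows from a page URL. The helper name `robots_url` is made up for this example; it only uses the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL.

    The file always lives at the root of the same scheme and host,
    regardless of the page's path or query string.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A subdomain has its own file:
print(robots_url("http://dev.site.com/page.html"))  # http://dev.site.com/robots.txt
# A subdirectory shares the main site's file:
print(robots_url("http://site.com/dev/page.html"))  # http://site.com/robots.txt
```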