Excluding a testing subdomain (sharing an SVN repository) from search engine crawling
I have:
- domain.com
- testing.domain.com
I want domain.com to be crawled and indexed by search engines, but not testing.domain.com
The testing domain and main domain share the same SVN repository, so I'm not sure if separate robots.txt files would work...
Comments (2)
1) Create a separate robots.txt file (name it robots_testing.txt, for example).
2) Add this rule to the .htaccess in your website's root folder:
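The rule itself was lost from the post; a sketch of a typical mod_rewrite rule matching the description below (assuming Apache with mod_rewrite enabled, and the testing.example.com hostname used in this answer) would be:

```apache
# Internally serve robots_testing.txt whenever robots.txt is
# requested on the testing subdomain
RewriteEngine On
RewriteCond %{HTTP_HOST} ^testing\.example\.com$ [NC]
RewriteRule ^robots\.txt$ robots_testing.txt [L]
```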
It will rewrite (internal redirect) any request for robots.txt to robots_testing.txt IF the domain name is testing.example.com. Alternatively, do the opposite: rewrite all requests for robots.txt to robots_disabled.txt for all domains except example.com:
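The original post's rule for this variant was also lost; a sketch under the same assumptions (Apache with mod_rewrite, negating the host match so only example.com gets the real file) might look like:

```apache
# Serve robots_disabled.txt for every host except example.com
RewriteEngine On
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com$ [NC]
RewriteRule ^robots\.txt$ robots_disabled.txt [L]
```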
testing.domain.com should have its own robots.txt file, located at http://testing.domain.com/robots.txt, as follows:
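The file's contents were dropped from the post; a minimal version matching the description (a blanket Disallow for all user-agents, plus the non-standard Noindex directive this answer mentions) would be:

```
User-agent: *
Disallow: /
Noindex: /
```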
This will disallow all bot user-agents, and since Google also looks at the Noindex directive, we'll throw it in for good measure.
You could also add your subdomain to Webmaster Tools: block it via robots.txt and submit a site removal request (though this works for Google only). For some more info have a look at
http://googlewebmastercentral.blogspot.com/2010/03/url-removal-explained-part-i-urls.html