robots.txt and relative paths
I want to disallow all files in any /tmp folder on my site. For example, I have "/anything/tmp/whatever/test.html", "/stuff/tmp/old/test.html", "/people/tmp/images.html", and so on.
Is it enough to put Disallow: /tmp/ into my robots.txt to block every tmp folder anywhere in my web server's file system? Or do I need to list every single path, like:
Disallow: /anything/tmp/
Disallow: /stuff/tmp/
Disallow: /tmp/
Or like this:
Disallow: /*/tmp/
Thanks
Answers (2)
Straight answer: NO
You'll have to declare each directory you want to exclude in robots.txt, because a Disallow rule is matched from the start of the URL path.
You can check the syntax of your robots.txt file at http://www.frobee.com/robots-txt-check
Read more about robot exclusion at http://www.robotstxt.org/orig.html
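To see the prefix behaviour for yourself, here is a minimal sketch using Python's standard-library urllib.robotparser, which follows the original prefix-matching rules. The helper name build_parser and the sample paths are just illustrations based on the question, and other crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser


def build_parser(rules):
    """Return a parser loaded with the given robots.txt lines (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser


# A single rule for /tmp/ only.
single_rule = build_parser(["User-agent: *", "Disallow: /tmp/"])

# One rule per directory, as suggested above.
per_directory = build_parser([
    "User-agent: *",
    "Disallow: /anything/tmp/",
    "Disallow: /stuff/tmp/",
    "Disallow: /people/tmp/",
    "Disallow: /tmp/",
])

paths = [
    "/tmp/test.html",
    "/anything/tmp/whatever/test.html",
    "/stuff/tmp/old/test.html",
    "/people/tmp/images.html",
]

for path in paths:
    # can_fetch() returns True when the path is NOT blocked for that user agent.
    print(path,
          "| single rule allows:", single_rule.can_fetch("*", path),
          "| per-directory allows:", per_directory.can_fetch("*", path))
```

With the single rule, only the top-level /tmp/ path is blocked; the nested tmp folders stay crawlable until each one is listed.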
It actually depends on the REP (Robots Exclusion Protocol) parser. More advanced parsers do recognize wildcard syntax, but wildcards are not part of the original spec.
That said, Google does honor wildcards. According to their parser:
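As a separate, rough illustration of what wildcard matching means (this is not Google's parser or its output, only a sketch), the function below treats * as "any sequence of characters" and a trailing $ as "end of path", which is the commonly documented wildcard extension. The name rule_matches is made up for this example.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check whether a wildcard-style robots.txt pattern matches a URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and everything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the usual prefix behaviour.
    return re.match(regex, path) is not None


for path in [
    "/anything/tmp/whatever/test.html",
    "/stuff/tmp/old/test.html",
    "/people/tmp/images.html",
    "/tmp/test.html",
]:
    print(path, rule_matches("/*/tmp/", path))
```

Under these assumptions /*/tmp/ covers the nested tmp folders but not the top-level /tmp/, so a crawler that honors wildcards would still need a separate Disallow: /tmp/ line for that one.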