The first rule will disallow news.php with parameters but allow news.php without ?id=__. If you do not want news.php to be crawled at all, then you have to use /news.php*
The Allow and Disallow lines in robots.txt say, "allow (or disallow) anything that starts with".
So:
Disallow: /news.php
is the same as
Disallow: /news.php*
Provided, of course, that the bot reading robots.txt understands wildcards. If the bot doesn't understand wildcards, then it will treat the asterisk as a part of the actual file name.
An asterisk at the end of the line is superfluous, and potentially hazardous.
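To make the difference concrete, here is a minimal sketch of the two behaviours described above. It is not how any particular crawler is implemented; `prefix_match` and `wildcard_match` are hypothetical names made up for illustration, and the sample URL `/news.php?id=5` is an assumption as well.

```python
import re


def prefix_match(rule_path: str, url_path: str) -> bool:
    """Bot without wildcard support: the rule is a literal prefix."""
    return url_path.startswith(rule_path)


def wildcard_match(rule_path: str, url_path: str) -> bool:
    """Bot with wildcard support: '*' stands for any run of characters."""
    # Escape the literal pieces and join them with '.*'. re.match anchors
    # at the start only, so the rule still behaves as a prefix.
    pattern = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.match(pattern, url_path) is not None


if __name__ == "__main__":
    # Print how each kind of bot would read each rule (illustrative only).
    for rule in ("/news.php", "/news.php*"):
        for url in ("/news.php", "/news.php?id=5"):
            print(rule, url, prefix_match(rule, url), wildcard_match(rule, url))
```

Under these assumptions, the wildcard-aware matcher treats `/news.php` and `/news.php*` identically, while the literal matcher no longer matches anything once the asterisk is appended, because no real URL contains `news.php*`. That is the hazard of the trailing asterisk.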
For sure
is correct.
No stars are needed if you have the full filename.
I would be interested to know, though, whether the
approach can work.