Listing both sitemaps and a sitemap index file in robots.txt?
My site consists of 3 main sections: Reviews, Forum, and Blog. I have plugins for the forum and blog that automatically generate sitemaps for them. The forum plugin generates a sitemap INDEX file pointing to multiple sitemaps, and the blog plugin generates a regular sitemap file containing all my blog content. Here are their entries from robots.txt:
Sitemap: http://www.datesphere.com/forum/sitemap-index.xml
Sitemap: http://www.datesphere.com/blog/sitemap.xml
I just created a Reviews sitemap.xml file that contains all the content in the Reviews section. I was planning to just add a line to robots.txt so the whole thing would look like this:
Sitemap: http://www.datesphere.com/forum/sitemap-index.xml
Sitemap: http://www.datesphere.com/blog/sitemap.xml
Sitemap: http://www.datesphere.com/reviews-sitemap.xml
HERE'S MY QUESTION: I know you can list multiple sitemaps in robots.txt, but is it OK to have a sitemap index file as well as multiple sitemaps listed? Will Googlebot ignore the other sitemap files if it finds a sitemap-index.xml file in robots.txt? If so, do I have to put my blog and reviews sitemaps in another sitemap index file and just list that in robots.txt?
I've checked around but can only find answers to the question "can I list multiple sitemaps?"
Comments (2)
Googlebot will not ignore any of the Sitemaps you list in robots.txt even if you list their parent Sitemap Index, too. We follow pretty much every link we find and if we're allowed to, we'll crawl them.
Personally, I'd probably list only the Sitemap Indexes, purely for manageability's sake, but it's up to you. Googlebot won't mind if you list both the indexes and the Sitemaps.
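To see that a mixed list is at least read as a flat list of entries, here is a minimal sketch using Python's standard-library `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). It only shows how that parser treats the three lines from the question; it is not a description of Googlebot's internals, and the helper name `listed_sitemaps` is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The three Sitemap lines from the question: one index file, two regular sitemaps.
ROBOTS_TXT = """\
Sitemap: http://www.datesphere.com/forum/sitemap-index.xml
Sitemap: http://www.datesphere.com/blog/sitemap.xml
Sitemap: http://www.datesphere.com/reviews-sitemap.xml
"""


def listed_sitemaps(robots_txt):
    """Return every Sitemap URL found in the given robots.txt text.

    Hypothetical helper: it parses the text in memory and never fetches anything.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in listed_sitemaps(ROBOTS_TXT):
        print(url)
```

The parser makes no distinction between an index file and a regular sitemap at this stage; each `Sitemap:` line is simply collected in order.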
When you have multiple sitemaps, you can either specify the URL of your sitemap index file in robots.txt, or list the URL of each sitemap file individually.
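As a rough illustration of the two options, the sketch below renders the `Sitemap:` lines for each one. The `www.example.com` URLs and the helper name `sitemap_directives` are assumptions made for the example, not values taken from any real site.

```python
def sitemap_directives(urls):
    """Render one 'Sitemap:' line per URL, in the order given (hypothetical helper)."""
    return "\n".join(f"Sitemap: {url}" for url in urls)


# Option 1: list only the sitemap index file (assumed URL).
index_only = sitemap_directives([
    "http://www.example.com/sitemap_index.xml",
])

# Option 2: list each sitemap file individually (assumed URLs).
individual = sitemap_directives([
    "http://www.example.com/sitemap-posts.xml",
    "http://www.example.com/sitemap-pages.xml",
])

if __name__ == "__main__":
    print(index_only)
    print()
    print(individual)
```

Either output can be pasted into robots.txt as-is; the `Sitemap:` lines are independent of any `User-agent` group, so their position in the file does not matter.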
Finally, this is what you need to pay attention to when adding the Sitemap directive to the robots.txt file.