Maximum number of links per page
I had a conversation about sitemaps with someone from marketing. They claimed that a single page shouldn't have more than 100 links because Google will not follow more than 100 when crawling a page. I had not heard of this limit before.
I did some searching and found that Google's Webmaster Guidelines used to state "keep the links on a given page to a reasonable number (fewer than 100)." [2008] The Google Webmaster Guidelines now just state "keep the links on a given page to a reasonable number."
When engineering a sitemap architecture for a site of, say, 1,000 pages (or a link list on any page, for that matter), would it be acceptable to place all 1,000 links on a single sitemap page, or should multiple sitemaps be used?
Also, does submitting an XML sitemap nullify the importance of an HTML sitemap to Google's spider? If so, I would imagine placing only important links on the HTML sitemap, instead of a link to every page, to tailor the page to end-user usability.
Depends on whether you're referring to sitemaps targeted towards users (which is what @adrian-k answered about) or sitemaps targeted towards robots (i.e. search engines).
If it's the second kind, then the answer is: you can (probably should) have several thousand links per page. It also pays to make life easier on your crawlers by including 'lastmod' values for your pages and by gzipping the page itself.
For information on valid formats for such sitemaps see http://www.sitemaps.org/protocol.php
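To make the 'lastmod' and gzip advice concrete, here is a minimal sketch in Python (assuming Python 3.8 or later). The function name `build_sitemap` and the example.com URLs are made up for illustration; the element names and namespace come from the sitemaps.org protocol.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as bytes.

    pages: iterable of (url, lastmod) pairs, where lastmod is a
    W3C date string such as '2011-05-01'.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages, for illustration only.
example_pages = [
    ("http://example.com/", "2011-05-01"),
    ("http://example.com/about", "2011-04-20"),
]
xml_bytes = build_sitemap(example_pages)
gzipped = gzip.compress(xml_bytes)  # what you would serve as sitemap.xml.gz
```

The protocol caps a single sitemap file at 50,000 URLs, so a 1,000-page site fits comfortably in one file.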
Just to validate, take a look at what the big boys are doing. In most cases you'll find a reference to the sitemap page at the bottom of /robots.txt. For instance, http://www.linkedin.com/robots.txt or https://profiles.google.com/robots.txt
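If you'd rather pull those references out programmatically, here is a rough sketch using Python's standard `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). The helper name `sitemaps_from_robots` and the sample robots.txt text are invented for the example; it works on text you have already downloaded, so nothing is fetched here.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the URLs named in 'Sitemap:' lines of a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Made-up robots.txt content, for illustration only.
sample = """User-agent: *
Disallow: /private/

Sitemap: http://example.com/sitemaps/index.xml.gz
"""
print(sitemaps_from_robots(sample))
```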
LinkedIn's sitemap, at http://partner.linkedin.com/sitemaps/smindex.xml.gz, lists another 2630 gzipped mini-sitemaps:
curl http://partner.linkedin.com/sitemaps/smindex.xml.gz | gunzip | wc -l
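A sitemap index like that is just a list of sitemap files. Below is a small sketch of how one might split a large URL list into several sitemaps and build the index that points at them. The names `chunk` and `build_index` and the example.com URLs are my own; the 50,000-URL cap per file is the limit given by the sitemaps.org protocol. This is not a claim about how LinkedIn generates theirs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000  # per-file limit from the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of urls, each at most `size` long."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_index(sitemap_urls):
    """Return sitemap index XML (bytes) listing the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical site with 120,000 pages, split into three sitemap files.
all_urls = ["http://example.com/page/%d" % i for i in range(120000)]
parts = list(chunk(all_urls))
index_xml = build_index(
    "http://example.com/sitemaps/part-%d.xml.gz" % n for n in range(len(parts))
)
```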
Google lists 7104 such pages on its Google Profiles sitemap (http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml):
curl http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml | grep '<loc>' | wc -l
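Note that grep counts lines containing the tag, so it undercounts if a file puts several entries on one line. A slightly more careful count parses the XML instead. This is only a sketch: `count_locs` is a name I made up, and it expects the raw bytes of a sitemap or sitemap index you have already downloaded (gzipped or not).

```python
import gzip
import xml.etree.ElementTree as ET

# Fully qualified <loc> tag in the sitemaps.org namespace.
LOC_TAG = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"


def count_locs(raw):
    """Count <loc> entries in sitemap bytes, gunzipping first if needed."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    root = ET.fromstring(raw)
    return sum(1 for _ in root.iter(LOC_TAG))


# Tiny made-up sitemap with both entries on a single line.
sample = (
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>http://example.com/</loc></url>"
    b"<url><loc>http://example.com/a</loc></url>"
    b"</urlset>"
)
print(count_locs(sample), count_locs(gzip.compress(sample)))
```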
Browse the websites of SEO-aware members of your industry and you should find some more examples (or discover that you can beat them hands-down with this knowledge).
I would say a site map is there for users - not search engines, so yes it's acceptable (but might still present usability challenges).
A site map lays out a site in such a way that a person can quickly understand the structure and content of the site, and it helps them get to what they want.
Saying that a search engine needs to be able to 'digest' an entire site map suggests that some content is only accessible via the site map, which should not be the case.