noindex, noarchive for dead web pages? What do you think is right?
I've noticed that a fair number of web apps do not deal very well with pages/accounts that have been deleted.
First off, I'll state that I am of the position that the content owner always owns the content and that if the content is deleted or the owner deletes the account, the provider/web app should do everything possible to stop indexing of said content.
To that end, I would think that a reasonable strategy would be to set 404 pages and placeholder pages to noindex, nofollow and noarchive in their meta tags.
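To make the idea concrete, here is a minimal sketch of a placeholder page that carries those three directives in a robots meta tag. The function name `render_placeholder_page` and the page wording are made up for illustration; they are not taken from any of the sites mentioned here.

```python
from html import escape

# The three directives proposed above, joined into one robots meta tag.
ROBOTS_DIRECTIVES = ("noindex", "nofollow", "noarchive")


def render_placeholder_page(message: str) -> str:
    """Return a minimal HTML placeholder page with a robots meta tag.

    Hypothetical helper: the markup is only an illustration of where
    the meta tag goes, not any real site's template.
    """
    robots = ", ".join(ROBOTS_DIRECTIVES)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f'<meta name="robots" content="{robots}">\n'
        "<title>Page removed</title>\n"
        "</head>\n<body>\n"
        # Escape the message so user-supplied text cannot inject markup.
        f"<p>{escape(message)}</p>\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    print(render_placeholder_page("This account has been deleted."))
```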
A couple of example cases: Flickr does neither when an account is deleted; instead it returns a page saying the account is deleted.
www.flickr.com/people/rebelchrome
Friendfeed returns a 404 with no special meta tags.
What do you think is the best/right thing to do in situations like this?
Comments (1)
The response code 410 Gone is for dead web pages (web pages which no longer exist and for which there is no obvious alternative). The page can still return a body.
Search engines encountering a 410 Gone status response will be able to realise that the page no longer exists and can act accordingly - for most search engines, this would mean simply taking it out of their index.
Humans encountering the page will just see the page body. Just like with a 404, you can have a custom 410 page, and this could be similar - containing a brief message that the page no longer exists, and maybe also a mini site map and search box allowing the user to find alternative content on the site.
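As a rough sketch of what that could look like, the handler below uses Python's standard `http.server` module to answer 410 Gone, with a human-readable body, for paths known to be deleted, and 404 for everything else. The `DELETED_PATHS` set, the example path and the page text are assumptions made up for illustration; a real app would look this up in its own data store.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical record of deleted accounts; a real app would query its database.
DELETED_PATHS = {"/people/example-deleted-user"}

GONE_BODY = (
    b"<html><body><h1>410 Gone</h1>"
    b"<p>This page no longer exists.</p>"
    b'<p><a href="/">Home</a> | <a href="/search">Search</a></p>'
    b"</body></html>"
)
NOT_FOUND_BODY = b"<html><body><h1>404 Not Found</h1></body></html>"


def status_for(path: str) -> int:
    """Return 410 for content known to be deleted, otherwise 404."""
    return 410 if path in DELETED_PATHS else 404


class GoneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = status_for(self.path)
        body = GONE_BODY if status == 410 else NOT_FOUND_BODY
        # The status line tells crawlers the page is gone;
        # the body is what human visitors see.
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Serve on localhost only; port 8000 is an arbitrary choice.
    HTTPServer(("127.0.0.1", 8000), GoneHandler).serve_forever()
```

With the server running, requesting `/people/example-deleted-user` should come back with a 410 status and the short "no longer exists" page, while any other path gets a plain 404.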
Using robots instructions like noindex is not really necessary when the page returns a 410/404 response, because the response code says it all, really.
The Flickr page you linked to has the message, mini site map and search box, but it should probably return a 410 or 404 error response, not a 200 response as it does.