Do search engines respect the HTTP header field “Content-Location”?

Posted on 2024-07-11 21:16:12

I was wondering whether search engines respect the HTTP header field Content-Location.

This could be useful, for example, when you want to remove the session ID parameter from the URL:

GET /foo/bar?sid=0123456789 HTTP/1.1
Host: example.com
…

HTTP/1.1 200 OK
Content-Location: http://example.com/foo/bar
…

Clarification:
I don’t want to redirect the request, as removing the session ID would lead to a completely different request and thus probably also a different response. I just want to state that the enclosed response is also available under its “main URL”.

Maybe my example was not a good representation of the intent of my question. So please take a look at What is the purpose of the HTTP header field “Content-Location”?.

Comments (5)

一绘本一梦想 2024-07-18 21:16:12

I think Google just announced the answer to my question: the canonical link relation for declaring the canonical URL.

Maile Ohye from Google wrote:

MickeyC said...
You should have used the Content-Location header instead, as per:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
"14.14 Content-Location"

@MikeyC: Yes, from a theoretical standpoint that makes sense and we certainly considered it. A few points, however, led us to choose rel="canonical":

  1. Our data showed that the "Content-Location" header is configured improperly on many web sites. Sometimes webmasters provide long, ugly URLs that aren’t even duplicates -- it's probably unintentional. They're likely unaware that their webserver is even sending the Content-Location header.

    It would've been extremely time consuming to contact site owners to clean up the Content-Location issues throughout the web. We realized that if we started with a clean slate, we could provide the functionality more quickly. With Microsoft and Yahoo! on-board to support this format, webmasters need to only learn one syntax.

  2. Often webmasters have difficulty configuring their web server headers, but can more easily change their HTML. rel="canonical" seemed like a friendly attribute.

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html?showComment=1234714860000#c8376597054104610625
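
As an illustration of the rel="canonical" approach described in the quote, here is a minimal Python sketch that renders the link element for a page's head. The helper name `canonical_link_tag` is hypothetical and not part of any library; the URL is the one from the question.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Render a <link rel="canonical"> element for the given URL.

    Hypothetical helper for illustration; the URL is HTML-escaped so that
    characters such as '&' in a query string stay valid inside the attribute.
    """
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("http://example.com/foo/bar"))
```

The resulting element would go inside the document's head, so it only applies to HTML responses.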

数理化全能战士 2024-07-18 21:16:12

Most decent crawlers do follow Content-Location. So, yes, search engines respect the Content-Location header, although that is no guarantee that the URL having the sid parameter will not be on the results page.

陪你到最终 2024-07-18 21:16:12

In 2009 Google started looking at URIs qualified as rel=canonical in the response body.

It looks like, since 2011, links formatted as per RFC 5988 are also parsed from the Link: header field. The Webmaster Tools FAQ also clearly mentions this as a valid option.

I guess this is the most up-to-date way of giving search engines some extra hypermedia breadcrumbs to follow, which lets you keep them out of the response body when you don't actually need to serve them as content.
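
A rough sketch of what that could look like for the session ID example from the question, in Python using only the standard library. The function names and the assumption that the session parameter is called `sid` are illustrative; how the header is attached to the response depends on your server or framework.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str, session_params=("sid",)) -> str:
    """Return the URL with session-style query parameters removed.

    Assumption: the parameters listed in session_params (here "sid", as in
    the question) never change which content is served.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in session_params
    ]
    # Drop the fragment as well; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_header(url: str, session_params=("sid",)) -> str:
    """Build the value of a Link header in the RFC 5988 format."""
    return f'<{canonical_url(url, session_params)}>; rel="canonical"'


if __name__ == "__main__":
    value = canonical_link_header("http://example.com/foo/bar?sid=0123456789")
    print(f"Link: {value}")
```

The response to the request with the session ID would then carry this Link header alongside the normal 200 OK, without any redirect.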

傲娇萝莉攻 2024-07-18 21:16:12

In addition to using 'Location' rather than 'Content-Location', use the proper HTTP status code in your response, depending on your reason for the redirect. Search engines tend to favor a permanent redirect (301) status over a temporary (302) status.
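
If a redirect is acceptable in your case, the choice between the two status codes might be sketched like this in Python. `redirect_response` is a hypothetical helper that only builds the status and headers; sending them is left to whatever server or framework you use.

```python
from http import HTTPStatus


def redirect_response(target_url: str, permanent: bool = True):
    """Return (status_code, headers) for a redirect to target_url.

    Hypothetical helper: 301 Moved Permanently by default, 302 Found when
    the redirect is only temporary.
    """
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target_url}


if __name__ == "__main__":
    print(redirect_response("http://example.com/foo/bar"))
```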

╰◇生如夏花灿烂 2024-07-18 21:16:12

Try the "Location:" header instead.
