Canonical tags and UTF-8
Would the following two canonical link tags be viewed by spiders as pointing to the same URL?
<link rel="canonical" href="http://www.example.com/&#375;" /> - encoded
<link rel="canonical" href="http://www.example.com/ŷ" /> - unencoded
&#375; is an HTML entity that represents the Unicode character with code point 375 in decimal notation. In hexadecimal that is 0x177, so we are talking about U+0177, which is ŷ. That means both URLs are exactly the same, provided the page's character set is handled correctly.
If the browser displays ŷ in both cases, the character set is likely correct, but you should verify that it is.
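As a quick check of the equivalence described above, here is a minimal Python sketch. It uses the standard library's html.unescape to resolve the numeric character reference roughly the way an HTML parser would. The variable names are illustrative and not part of the original question, and this says nothing about how any particular search engine spider is implemented.

```python
import html

# The two href values from the question (variable names are illustrative).
encoded_href = "http://www.example.com/&#375;"
unencoded_href = "http://www.example.com/\u0177"  # \u0177 is the character ŷ

# html.unescape resolves numeric character references such as &#375;.
decoded_href = html.unescape(encoded_href)

print(hex(375))                        # the code point in hexadecimal
print(decoded_href == unencoded_href)  # compares the two hrefs after decoding
```

After entity decoding, both attributes hold the same string, which is why a parser that reads the document with the right character set should see one URL.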
If you serve your HTML as UTF-8, the two URLs are seen as the same.
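To illustrate why the declared encoding matters, the sketch below decodes the same bytes two ways. ISO-8859-1 is picked here only as an example of a mismatched charset; it is an assumption for illustration, not something the original answer mentions.

```python
# The bytes a server would send for ŷ (U+0177) in a UTF-8 document.
raw = "\u0177".encode("utf-8")

# Decoded with the correct charset, the bytes yield the intended character.
as_utf8 = raw.decode("utf-8")

# Decoded with a mismatched charset (ISO-8859-1 is an assumed example),
# the same two bytes become two unrelated characters.
as_latin1 = raw.decode("iso-8859-1")

print(raw)
print(as_utf8 == "\u0177")
print(as_latin1 == as_utf8)
```

If the charset is wrong, the unencoded href no longer contains ŷ, while the entity form still resolves to U+0177, so the two tags would stop matching.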
Not 100% sure, but I think both would point to the same URL. Keep in mind, though, that W3C standards often recommend encoding links.
Even though you can expect it to work in modern browsers, http://www.example.com/ŷ is an invalid URL. You should always percent-encode Unicode characters.
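For reference, a small sketch of percent-encoding the path with Python's urllib.parse.quote, which encodes non-ASCII characters as UTF-8 bytes by default. The variable names are hypothetical, and whether a given crawler treats this form as identical to the other two is not something this code can confirm.

```python
from urllib.parse import quote, unquote

# The path from the question; \u0177 is the character ŷ.
path = "/\u0177"

# quote() leaves "/" alone by default and percent-encodes the UTF-8 bytes of ŷ.
encoded_path = quote(path)
canonical = "http://www.example.com" + encoded_path

print(canonical)
print(unquote(encoded_path) == path)  # decoding round-trips to the original path
```

The resulting href contains only ASCII characters, so it no longer depends on the document's character set being interpreted correctly.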