Besides the ampersand (&), which other characters should be encoded in HTML href/src attributes?

Posted 2024-12-05 03:41:45

Is the ampersand the only character that should be encoded in an HTML attribute?

It's well known that this won't pass validation:

<a href="http://domain.com/search?q=whatever&lang=en"></a>

Because the ampersand should be written as &amp;. Here's a direct link to the validation failure.

This guy lists a bunch of characters that should be encoded, but he's wrong. If you encode the first "/" in http:// the href won't work.

In ASP.NET, is there a helper method already built to handle this? Stuff like Server.UrlEncode and HtmlEncode obviously don't work - those are for different purposes.

I can build my own simple extension method (like .ToAttributeView()) which does a simple string replace.

鸵鸟症 2024-12-12 03:41:45

Other than standard URI encoding of the values, & is the only character related to HTML entities that you have to worry about simply because this is the character that begins every HTML entity. Take for example the following URL:

http://query.com/?q=foo&lt=bar&gt=baz

Even though there are no trailing semicolons, &lt; is the entity for < and &gt; is the entity for >, so some old browsers would translate this URL to:

http://query.com/?q=foo<=bar>=baz

So you need to write & as &amp; to prevent this from happening to links inside an HTML-parsed document.
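
To make the point above concrete, here is a minimal sketch in Python (not ASP.NET; the language is chosen only for illustration). It assumes that Python's html.unescape, which decodes semicolon-less legacy entities such as &lt and &gt, is a fair stand-in for the lenient old-browser behaviour described in this answer. The variable names are hypothetical.

```python
import html

# The URL from the answer, written into HTML with a bare "&".
raw_url = "http://query.com/?q=foo&lt=bar&gt=baz"

# A lenient decoder treats "&lt" and "&gt" as entities even without ";".
lenient = html.unescape(raw_url)
print(lenient)   # http://query.com/?q=foo<=bar>=baz

# Escaping "&" as "&amp;" before emitting the attribute avoids the problem.
escaped = html.escape(raw_url)
print(escaped)   # http://query.com/?q=foo&amp;lt=bar&amp;gt=baz

# Decoding the escaped form gives back the URL that was intended.
round_trip = html.unescape(escaped)
print(round_trip == raw_url)   # True
```

The escaped string is what belongs inside the href attribute; the browser decodes &amp; back to & before it requests the URL.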

记忆之渊 2024-12-12 03:41:45

The purpose of escaping characters is so that they won't be processed as arguments. So you actually don't want to encode the entire url, just the values you are passing via the querystring. For example:

http://example.com/?parameter1=<ENCODED VALUE>&parameter2=<ENCODED VALUE>

The url you showed is actually a perfectly valid url that will pass validation. However, the browser will interpret the & symbols as a break between parameters in the querystring. So your querystring:

?q=whatever&lang=en

Will actually be translated by the recipient as two parameters:

q = "whatever"
lang = "en"

For your url to work you just need to ensure that your values are being encoded:

?q=<ENCODED VALUE>&lang=<ENCODED VALUE>
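
As a rough illustration of encoding only the values, the following Python sketch uses urllib.parse.urlencode. The parameter values are made up for the example, and Python stands in here for whatever server-side language is actually in use.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical values; the "&" inside the first one must not act as a separator.
params = {"q": "fish & chips", "lang": "en"}

# urlencode percent-encodes each value and joins the pairs with "&".
query = urlencode(params)
print(query)   # q=fish+%26+chips&lang=en

# The recipient splits on the unencoded "&" and recovers both values intact.
decoded = parse_qs(query)
print(decoded)   # {'q': ['fish & chips'], 'lang': ['en']}
```

The & that separates the two parameters stays literal in the URL; only the & inside a value becomes %26.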

Edit: The common problems page from the W3C you linked to is talking about edge cases when URLs are rendered in HTML and the & is followed by text that could be interpreted as an entity reference (&copy for example). Here is a test in jsfiddle showing the URL:

http://jsfiddle.net/YjPHA/1/

In Chrome and Firefox the link works correctly, but IE renders &copy as ©, breaking the link. I have to admit I've never had a problem with this in the wild (it would only affect those entity references which don't require a semicolon, which is a pretty small subset).

To ensure you're safe from this bug you can HTML-encode any URLs you render to the page and you should be fine. If you're using ASP.NET, the HttpUtility.HtmlEncode method should work just fine.
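
A sketch of the two-step approach this answer describes: percent-encode the values first, then HTML-encode the whole URL when writing it into the page. It is shown in Python rather than ASP.NET, and to_attribute_value is a hypothetical helper named after the .ToAttributeView() idea in the question, not an existing API.

```python
import html
from urllib.parse import urlencode

def to_attribute_value(base_url: str, params: dict) -> str:
    """Hypothetical helper: build a URL, then escape it for an HTML attribute."""
    # Step 1: percent-encode the query values (URL layer).
    url = base_url + "?" + urlencode(params)
    # Step 2: escape &, <, > and quotes for the HTML layer.
    return html.escape(url, quote=True)

href = to_attribute_value("http://domain.com/search", {"q": "whatever", "lang": "en"})
anchor = f'<a href="{href}">search</a>'
print(anchor)   # <a href="http://domain.com/search?q=whatever&amp;lang=en">search</a>
```

The order matters: escaping for HTML first and percent-encoding afterwards would turn the & of &amp; into %26 and change the URL.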

云巢 2024-12-12 03:41:45

You do not need HTML escaping here:

<a href="http://domain.com/search?q=whatever&lang=en"></a>

According to the HTML5 spec:
http://www.w3.org/TR/html5/tokenization.html#character-reference-in-attribute-value-state

&lang= should be parsed as an unrecognized character reference, and the value of the attribute should be used as is: http://domain.com/search?q=whatever&lang=en

For reference, I raised this question with the HTML5 WG: http://lists.w3.org/Archives/Public/public-html/2011Sep/0163.html
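
The following Python sketch illustrates why &lang= is left alone while a name such as &copy can still be a hazard. It assumes html.unescape as a stand-in decoder; that function applies general text rules and is not a browser's attribute tokenizer, so treat the second result as the lenient, older-browser outcome rather than what the HTML5 attribute rules prescribe. The second URL is invented for the example.

```python
import html

# "lang" has no semicolon-less entity form, so nothing is decoded.
plain = "http://domain.com/search?q=whatever&lang=en"
plain_decoded = html.unescape(plain)
print(plain_decoded == plain)   # True

# "copy" is a legacy entity that is recognized without ";".
risky = "http://example.com/?a=1&copy=2"
risky_decoded = html.unescape(risky)
print(risky_decoded)   # http://example.com/?a=1©=2

# Writing "&" as "&amp;" is unambiguous under either reading.
safe = html.escape(risky)
print(safe)   # http://example.com/?a=1&amp;copy=2
```

So the unescaped form happens to work for this particular URL, but escaping the ampersand keeps the result independent of which parameter names are in use.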
