Is encodeURIComponent really useful?

Posted on 2024-08-20 14:03:04

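When performing an HTTP GET request to a server, I still don't understand what the advantage is of using the JS function encodeURIComponent to encode each component of the request.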

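After some testing, I found that the server (using PHP) gets the values of the HTTP GET request correctly even if I don't use encodeURIComponent! Obviously I still need to encode the special characters & ? = / : on the client side, otherwise an HTTP GET value such as "peace&love=virtue" would be treated as a new key-value pair of the request rather than as a single value. But why does encodeURIComponent also encode many other characters, such as 'è', which is translated into %C3%A8 and then has to be decoded on the PHP server with the utf8_decode function?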

With encodeURIComponent, all values of the HTTP GET request are UTF-8 encoded, so when reading them in PHP I have to call utf8_decode on every $_GET value every time, which is quite annoying.

Why can't we just encode the & ? = / : characters?

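See also: "JS encodeURIComponent result different from the one created by FORM". It shows that encodeURIComponent doesn't even encode properly, because a simple browser FORM GET encodes characters such as '€' differently. So I still wonder: what is encodeURIComponent for?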


你的往事 2024-08-27 14:03:04

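That's because:

"A Uniform Resource Identifier (URI) is defined in [RFC3986] as a sequence of characters chosen from a limited subset of the repertoire of US-ASCII [ASCII] characters."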

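So officially Unicode is not supported; see the RFC for details. All modern browsers support it anyway, which is why your results look fine. But for the odd case of a browser or system that does not support it, you encode the value and thereby make sure it works in every standards-compliant browser.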
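To illustrate the point about the limited US-ASCII repertoire, here is a minimal sketch showing that the output of encodeURIComponent contains only ASCII characters even when the input does not. The helper isAsciiOnly is a made-up name for this example; 'è' comes from the question, and the other sample values are illustrative.

```javascript
// Returns true when every UTF-16 code unit of the string is in the US-ASCII range (0-127).
// (Hypothetical helper written for this example only.)
function isAsciiOnly(text) {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 127) return false;
  }
  return true;
}

// 'è' is the character from the question; the other samples are illustrative.
for (const raw of ['è', '€', 'café au lait']) {
  const encoded = encodeURIComponent(raw);
  console.log(raw, isAsciiOnly(raw), '->', encoded, isAsciiOnly(encoded));
}
// è false -> %C3%A8 true
// € false -> %E2%82%AC true
// café au lait false -> caf%C3%A9%20au%20lait true
```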


治碍 2024-08-27 14:03:04

This is a character encoding issue (again). As Gaby stated, URIs are a sequence of ASCII characters (thus only bytes in the range 0–127). So any character that is not in ASCII needs to be encoded with the Percent-Encoding.

And since UTF-8 is the new “universal character encoding”, nowadays user agents interpret the URI to be UTF-8 encoded. But these UTF-8 encoded words are themselves also encoded with the Percent-Encoding since URIs cannot contain any other characters except those in ASCII.

That means, when you enter http://en.wikipedia.org/wiki/€ into your browser’s address field, your browser looks up the UTF-8 byte sequence for € (0xE2 0x82 0xAC) and applies the Percent-Encoding to each byte (%E2%82%AC). So http://en.wikipedia.org/wiki/€ will actually result in http://en.wikipedia.org/wiki/%E2%82%AC.
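The two steps described here (take the UTF-8 bytes, then percent-encode each byte) can be sketched by hand and compared with what encodeURIComponent returns. The function percentEncodeUtf8 is a hypothetical helper for illustration, not a standard API; unlike encodeURIComponent it escapes every byte, including plain ASCII letters.

```javascript
// Percent-encode every byte of the string's UTF-8 representation.
// (Illustrative helper only; it escapes ASCII characters too.)
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  return Array.from(bytes, (b) => '%' + b.toString(16).toUpperCase().padStart(2, '0')).join('');
}

const euroBytes = Array.from(new TextEncoder().encode('€'), (b) => b.toString(16));
console.log(euroBytes);                // [ 'e2', '82', 'ac' ]
console.log(percentEncodeUtf8('€'));   // %E2%82%AC
console.log(encodeURIComponent('€'));  // %E2%82%AC
```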

To show you that this is true, just enter http://en.wikipedia.org/wiki/%E2%82%AC into your address field and your browser will probably turn that into http://en.wikipedia.org/wiki/€. That is because nowadays user agents interpret the URI to be UTF-8 encoded.
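The same round trip can be reproduced outside the address bar. This is a small sketch that uses the standard URL class as a stand-in for what the browser does with typed input; the variable name typed is arbitrary.

```javascript
// Parsing a URL with a raw '€' in the path percent-encodes it as UTF-8.
const typed = new URL('http://en.wikipedia.org/wiki/€');
console.log(typed.href);                       // http://en.wikipedia.org/wiki/%E2%82%AC

// Decoding interprets the escaped bytes as UTF-8 again.
console.log(decodeURIComponent('%E2%82%AC'));  // €
console.log(decodeURI(typed.href));            // http://en.wikipedia.org/wiki/€
```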

Now back to your initial question, why you should apply the Percent-Encoding explicitly: Imagine you have a web page where you want to link to the Wikipedia article on the Euro sign. If you just write the URI with a plain €:

<a href="http://en.wikipedia.org/wiki/€">Euro sign</a>

Your browser will use the character encoding of the document for the € character. That means that if your document’s encoding is Windows-1252 (as in your other question), the € will be encoded as 0x80 and the URI will be http://en.wikipedia.org/wiki/%80 (this actually works, because Wikipedia is clever enough to guess Windows-1252, the most popular character encoding with a printable character at 0x80).

But if your document’s encoding is ISO 8859-15, the € will be encoded as 0xA4, which represents the currency sign ¤ in ISO 8859-1 (Wikipedia will choose ISO 8859-1 because 0xA4 is an invalid byte sequence in UTF-8 and HTTP specifies ISO 8859-1 as the default character encoding).
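A rough sketch of why these single-byte forms are fragile: the byte values below are hard-coded from the text above rather than computed, and the helper names percentEncodeByte and tryDecode are made up for this example. Neither escaped form can be read back as UTF-8, so the receiver is left to guess the encoding.

```javascript
// Byte used for '€' in each legacy encoding, as stated in the answer (hard-coded, not computed).
const euroByteByCharset = {
  'windows-1252': 0x80,
  'iso-8859-15': 0xa4,
};

// Percent-encode a single byte value (hypothetical helper).
function percentEncodeByte(byte) {
  return '%' + byte.toString(16).toUpperCase().padStart(2, '0');
}

// decodeURIComponent expects UTF-8; report the error name when the bytes are not valid UTF-8.
function tryDecode(escaped) {
  try {
    return decodeURIComponent(escaped);
  } catch (err) {
    return err.name;
  }
}

for (const [charset, byte] of Object.entries(euroByteByCharset)) {
  const escaped = percentEncodeByte(byte);
  console.log(charset, escaped, tryDecode(escaped));
}
// windows-1252 %80 URIError
// iso-8859-15 %A4 URIError
```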

So I recommend always using the Percent-Encoding to avoid mistakes. Don’t let the user agents guess what you mean.
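As a sketch of that recommendation applied to the question's own example, the snippet below encodes each key and value explicitly before joining them. The parameter names motto and letter are hypothetical, and URLSearchParams is used here only as a stand-in for the server-side parser.

```javascript
// Hypothetical parameter names; the values come from the question.
const params = { motto: 'peace&love=virtue', letter: 'è' };

// Encode every key and value, then join the pairs with '&'.
const query = Object.entries(params)
  .map(([key, value]) => encodeURIComponent(key) + '=' + encodeURIComponent(value))
  .join('&');
console.log(query); // motto=peace%26love%3Dvirtue&letter=%C3%A8

// Without encoding, the single value is split into two key-value pairs.
const unencodedKeys = [...new URLSearchParams('motto=peace&love=virtue').keys()];
console.log(unencodedKeys); // [ 'motto', 'love' ]

// With encoding, the original values survive the round trip.
const parsed = new URLSearchParams(query);
console.log(parsed.get('motto'));  // peace&love=virtue
console.log(parsed.get('letter')); // è
```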
