When should a space be encoded as a plus (+) or as %20?

Posted 2024-08-29 06:26:05

Sometimes spaces in a URL are encoded as the + sign, and at other times as %20. What is the difference, and why does this happen?


Comments (5)

泪冰清 2024-09-05 06:26:05

+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:

http://www.example.com/path/foo+bar/path?query+name=query+value

In this URL, the parameter name is query name with a space and the value is query value with a space, but the folder name in the path is literally foo+bar, not foo bar.
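To see the distinction concretely, here is a minimal sketch using Python's standard `urllib.parse` module (Python is only my choice for illustration; the answer does not name a language for decoding). It splits the example URL above and decodes the path and the query under their respective rules.

```python
from urllib.parse import urlsplit, unquote, parse_qsl

url = "http://www.example.com/path/foo+bar/path?query+name=query+value"
parts = urlsplit(url)

# Path: plain percent-decoding only, so "+" stays a literal plus sign.
decoded_path = unquote(parts.path)

# Query: parsed as application/x-www-form-urlencoded, so "+" means a space.
decoded_query = parse_qsl(parts.query)

print(decoded_path)   # /path/foo+bar/path
print(decoded_query)  # [('query name', 'query value')]
```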

%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately it's not what urlencode does in PHP (rawurlencode is safer).
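As a rough illustration of the two encoding styles, the sketch below uses Python's `urllib.parse`. Treat `quote(..., safe="")` as only an approximate counterpart of `encodeURIComponent()` or `rawurlencode`, and `quote_plus` as an approximate counterpart of PHP's `urlencode`; the exact sets of characters left unescaped differ between languages. The sample string is made up for the example.

```python
from urllib.parse import quote, quote_plus

text = "foo bar+baz"  # hypothetical input containing a space and a literal plus

# Percent-style: space -> %20, plus -> %2B. Safe in any part of a URL.
percent_style = quote(text, safe="")

# Form-style: space -> "+", literal plus -> %2B. Only meaningful where the
# receiver decodes application/x-www-form-urlencoded data.
form_style = quote_plus(text)

print(percent_style)  # foo%20bar%2Bbaz
print(form_style)     # foo+bar%2Bbaz
```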

See Also

HTML 4.01 Specification application/x-www-form-urlencoded

北恋 2024-09-05 06:26:05

So, the answers here are all a bit incomplete. The use of a %20 to encode a space in URLs is explicitly defined in RFC 3986, which defines how a URI is built. There is no mention in this specification of using a + for encoding spaces - if you go solely by this specification, a space must be encoded as %20.

The mention of using + for encoding spaces comes from the various incarnations of the HTML specification - specifically in the section describing content type application/x-www-form-urlencoded. This is used for posting form data.

Now, the HTML 2.0 specification (RFC 1866) explicitly said, in section 8.2.2, that the query part of a GET request's URL string should be encoded as application/x-www-form-urlencoded. This, in theory, suggests that it's legal to use a + in the URL in the query string (after the ?).

But... does it really? Remember, HTML is itself a content specification, and URLs with query strings can be used with content other than HTML. Further, while the later versions of the HTML spec continue to define + as legal in application/x-www-form-urlencoded content, they completely omit the part saying that GET request query strings are defined as that type. There is, in fact, no mention whatsoever about the query string encoding in anything after the HTML 2.0 specification.

Which leaves us with the question: is it valid? Certainly there's a lot of legacy code which supports + in query strings, and a lot of code which generates it as well. So odds are good that nothing will break if you use +. (And, in fact, I did all the research on this recently because I discovered a major site which failed to accept %20 in a GET query as a space. They actually failed to decode any percent-encoded character. So the service you're using may be relevant as well.)

But from a pure reading of the specifications, without the language from the HTML 2.0 specification carried over into later versions, URLs are covered entirely by RFC 3986, which means spaces ought to be converted to %20. And definitely that should be the case if you are requesting anything other than an HTML document.
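If you want to produce either form deliberately, one possible sketch in Python follows. It assumes the receiving side uses a form-style parser such as `urllib.parse.parse_qs`, which accepts both spellings; as noted above, a particular service may not. The parameter name and value are borrowed from the first answer's example.

```python
from urllib.parse import urlencode, quote, parse_qs

params = {"query name": "query value"}

# Default: form-style encoding, spaces become "+".
form_query = urlencode(params)

# quote_via=quote switches to percent-encoding, spaces become %20.
strict_query = urlencode(params, quote_via=quote)

print(form_query)    # query+name=query+value
print(strict_query)  # query%20name=query%20value

# A form-style parser decodes both spellings to the same data.
assert parse_qs(form_query) == parse_qs(strict_query)
```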

梦屿孤独相伴 2024-09-05 06:26:05

http://www.example.com/some/path/to/resource?param1=value1

The part before the question mark must use percent-encoding (so %20 for a space). After the question mark you can use either %20 or + for a space. If you need an actual + after the question mark, use %2B.
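A small sketch of that rule in Python, building a URL piece by piece. The path segment `my file` and the value `1 + 1` are invented for the example; only the base URL comes from the answer.

```python
from urllib.parse import quote, quote_plus, unquote_plus

# Before the "?": percent-encoding only, so the space becomes %20.
segment = quote("my file", safe="")

# After the "?": "+" may stand for a space, and a real plus must be %2B.
value = quote_plus("1 + 1")

url = f"http://www.example.com/some/path/{segment}?param1={value}"
print(url)  # http://www.example.com/some/path/my%20file?param1=1+%2B+1

# Form-style decoding restores the original value, including the real plus.
print(unquote_plus(value))  # 1 + 1
```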

风追烟花雨 2024-09-05 06:26:05

For compatibility reasons, it's better to always encode spaces as "%20", not as "+".

RFC 1866 (the HTML 2.0 specification) specified that space characters should be encoded as "+" in "application/x-www-form-urlencoded" key-value pairs (see section 8.2.1, item 1). This way of encoding form data also appears in later HTML specifications; look for the paragraphs about application/x-www-form-urlencoded.

Here is an example of a URL string where RFC 1866 allows encoding spaces as pluses: "http://example.com/over/there?name=foo+bar". So, according to RFC 1866, spaces can be replaced by pluses only after the "?". In all other cases, spaces should be encoded as %20. But since it's hard to determine the context, it is best practice to never encode spaces as "+".

I would recommend percent-encoding all characters except the "unreserved" ones defined in RFC 3986, section 2.3.

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
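A minimal sketch of that recommendation, written out by hand in Python so the rule is visible. The function name `percent_encode` is hypothetical, and encoding the input as UTF-8 first is my assumption; the answer does not specify a character encoding.

```python
import string

# The "unreserved" set from RFC 3986, section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def percent_encode(text: str) -> str:
    """Percent-encode every byte of the UTF-8 form except unreserved ones."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            out.append(f"%{byte:02X}")  # e.g. space -> %20, plus -> %2B
    return "".join(out)

print(percent_encode("foo bar+baz~"))  # foo%20bar%2Bbaz~
```

In practice the standard library's `urllib.parse.quote(text, safe="")` should give the same result on recent Python versions; the manual loop is only there to show the rule.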

The only situation when you may want to encode spaces as "+" (one byte) rather than "%20" (three bytes) is when you know for sure how to interpret the context, and when the size of the query string is of the essence.

最终幸福 2024-09-05 06:26:05

What's the difference? See the other answers.

When should we use + instead of %20? Use + if, for some reason, you want to make the URL query string (?.....) or hash fragment (#....) more readable. Example: You can actually read this:

https://www.google.se/#q=google+doesn%27t+encode+:+and+uses+%2B+instead+of+spaces
(%2B = +)

But the following is a lot harder to read (at least to me):

https://www.google.se/#q=google%20doesn%27t%20oops%20:%20%20this%20text%20%2B%20is%20different%20spaces

I would think + is unlikely to break anything, since Google uses + (see the 1st link above) and they've probably thought about this. I'm going to use + myself just because readable + Google thinks it's OK.
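For comparison, the same text can be encoded both ways with Python's `urllib.parse`. This is only a sketch: keeping ":" unescaped via `safe=":"` is my choice so that the output matches the first link above, not something Google documents.

```python
from urllib.parse import quote, quote_plus

text = "google doesn't encode : and uses + instead of spaces"

# Spaces as "+", real plus as %2B, ":" left alone.
plus_form = quote_plus(text, safe=":")

# Same text with spaces as %20.
percent_form = quote(text, safe=":")

print(plus_form)
# google+doesn%27t+encode+:+and+uses+%2B+instead+of+spaces
print(percent_form)
# google%20doesn%27t%20encode%20:%20and%20uses%20%2B%20instead%20of%20spaces
```

The "+" form is also shorter: each space costs one character instead of three.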
