Characters allowed in GET parameters

Posted on 2024-08-05 12:53:41, 344 words, 7 views, 0 comments

Which characters are allowed in GET parameters without encoding or escaping them? I mean something like this:

http://www.example.org/page.php?name=XYZ

What can you have there instead of XYZ? I think only the following characters:

  • a-z (A-Z)
  • 0-9
  • -
  • _

Is this the full list or are there additional characters allowed?

Comments (8)

反目相谮 2024-08-12 12:53:41

Some characters are reserved, meaning they have a special purpose in a URI: the general delimiters :/?#[]@ and the sub-delimiters !$&'()*+,;=

There is also a set of unreserved characters, the alphanumerics and -._~, which are not to be encoded.

That means anything outside the unreserved set is supposed to be %-encoded when it is not being used for its special meaning (e.g. when it is passed as part of a GET parameter value).

See also RFC3986: Uniform Resource Identifier (URI): Generic Syntax
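
A minimal sketch of that rule in JavaScript, assuming the value is pure data. The built-in encodeURIComponent already leaves the unreserved characters alone, but it also leaves ! ' ( ) * untouched, so the helper below (encodeUnreservedOnly, a hypothetical name that is not part of the answer) encodes those five as well.

```javascript
// Hypothetical helper: percent-encode everything except the RFC 3986
// unreserved characters (ALPHA / DIGIT / "-" / "." / "_" / "~").
function encodeUnreservedOnly(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

const plain = encodeUnreservedOnly("AZaz09-._~");
const mixed = encodeUnreservedOnly("a b&c=d/(e)");

console.log(plain); // unreserved characters pass through unchanged
console.log(mixed); // space, &, =, /, ( and ) come out percent-encoded
```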

断爱 2024-08-12 12:53:41

The question asks which characters are allowed in GET parameters without encoding or escaping them.

According to RFC3986 (general URL syntax) and RFC7230, section 2.7.1 (HTTP/S URL syntax), the only characters you need to percent-encode are those outside of the query set; see the definition below.

However, there are additional specifications like HTML5, Web forms, and the obsolete Indexed search, W3C recommendation. Those documents give a special meaning to some characters, notably the symbols = & + ;.

Other answers here suggest that most of the reserved characters should be encoded, including "/" and "?". That's not correct. In fact, RFC3986, section 3.4 advises against percent-encoding the "/" and "?" characters:

it is sometimes better for usability to avoid percent-encoding those characters.

RFC3986 defines the query component as:

query       = *( pchar / "/" / "?" )
pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~" 

A percent-encoding mechanism is used to represent a data octet in a
component when that octet's corresponding character is outside the
allowed set or is being used as a delimiter of, or within, the
component.

The conclusion is that the XYZ part should encode:

special: # % = & ;
Space
sub-delims
out of query set: [ ]
non-ASCII characters

The exception is when the special symbols = & ; are being used as key=value separators.

Encoding other characters is allowed but not necessary.
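
The query grammar quoted above can be turned into a quick syntax check. This is only a sketch: QUERY_PATTERN and isValidQuery are hypothetical names, and the regular expression is a hand transcription of the ABNF rather than anything taken from the RFC text.

```javascript
// Matches a string made only of characters the RFC 3986 query rule allows:
// unreserved, sub-delims, ":", "@", "/", "?" or a %XX escape.
const QUERY_PATTERN = /^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@\/?]|%[0-9A-Fa-f]{2})*$/;

function isValidQuery(query) {
  return QUERY_PATTERN.test(query);
}

console.log(isValidQuery("name=XYZ"));      // true
console.log(isValidQuery("next=/a/b?c=d")); // true: "/" and "?" are allowed
console.log(isValidQuery("name=X Y"));      // false: raw space
console.log(isValidQuery("ids[]=1"));       // false: "[" and "]" are outside the set
console.log(isValidQuery("pct=100%"));      // false: "%" must start a %XX escape
```

Passing this check only means the string is syntactically a query. Characters such as = and & are still read as separators by most servers, so they need encoding when they are part of a value.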

风蛊 2024-08-12 12:53:41

All of the rules concerning the encoding of URIs (which include URNs and URLs) are specified in RFC1738 and RFC3986. Here's a TL;DR of these long and boring documents:

Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a URI under certain circumstances. The characters allowed in a URI are either reserved or unreserved. Reserved characters are those characters that sometimes have special meaning, but they are not the only characters that need encoding.

There are 66 unreserved characters that don't need any encoding:
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.~

There are 18 reserved characters which need to be encoded: !*'();:@&=+$,/?#[], and all the other characters must be encoded.

To percent-encode a character, simply concatenate "%" and its ASCII value in hexadecimal. The PHP functions urlencode and rawurlencode do this job for you, as do the JS functions encodeURIComponent and encodeURI.
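
As a rough illustration of the "%" plus hexadecimal rule, the sketch below encodes every byte of a string by hand and then shows how the two JS functions differ. percentEncodeAll is a hypothetical helper, and it assumes non-ASCII characters are encoded as their UTF-8 bytes, which the answer does not spell out.

```javascript
// Percent-encode every UTF-8 byte of a string as "%" + two uppercase hex digits.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAll("&"));      // %26
console.log(percentEncodeAll(" "));      // %20
console.log(percentEncodeAll("\u00e9")); // %C3%A9 (two UTF-8 bytes)

// encodeURI keeps reserved characters; encodeURIComponent encodes most of them.
const sample = "a b&c/d?e=f#g";
console.log(encodeURI(sample));          // a%20b&c/d?e=f#g
console.log(encodeURIComponent(sample)); // a%20b%26c%2Fd%3Fe%3Df%23g
```

For a single parameter value, encodeURIComponent is usually the one to reach for, since encodeURI leaves & = ? # intact.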

酒绊 2024-08-12 12:53:41

I did a test using the Chrome address bar and a $QUERY_STRING in bash, and observed the following:

~!@$%^&*()-_=+[{]}\|;:',./? and grave (backtick) are passed through as plaintext.

Space, ", < and > are converted to %20, %22, %3C and %3E respectively.

# is ignored, since it is used by ye olde anchor.

Personally, I'd say bite the bullet and encode with base64 :)
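
One caveat with the base64 suggestion: the standard alphabet contains +, / and =, which themselves carry meaning in a query string. A possible workaround, sketched here for Node.js only (Buffer is a Node global), is the URL-safe variant that swaps those characters out. toBase64Url and fromBase64Url are hypothetical helper names.

```javascript
// Encode text as URL-safe base64: "+" becomes "-", "/" becomes "_", padding is dropped.
function toBase64Url(text) {
  return Buffer.from(text, "utf8")
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Reverse the character swap and decode back to text.
function fromBase64Url(encoded) {
  const base64 = encoded.replace(/-/g, "+").replace(/_/g, "/");
  return Buffer.from(base64, "base64").toString("utf8");
}

const original = "a b&c=d/e?f#g";
const token = toBase64Url(original);

console.log(token);                             // only letters, digits, "-" and "_"
console.log(fromBase64Url(token) === original); // true
```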

毁虫ゝ 2024-08-12 12:53:41

From RFC 1738 on which characters are allowed in URLs:

Only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.

The reserved characters are ";", "/", "?", ":", "@", "=" and "&", which means you would need to URL encode them if you wish to use them.

握住你手 2024-08-12 12:53:41

Alphanumeric characters and all of

~ - _ . ! * ' ( ) ,

are valid within a URL.

All other characters must be encoded.

泛泛之交 2024-08-12 12:53:41

"." | "!" | "~" | "*" | "'" | "(" | ")" are also acceptable [RFC2396]. Really, anything can be in a GET parameter if it is properly encoded.

颜漓半夏 2024-08-12 12:53:41

When special characters are passed unencoded, you can get an error saying the value cannot be decoded, so use encodeURIComponent. For example, I resolved my issue like this:

// Replace the SEARCH_TEXT placeholder with the percent-encoded JSON value
updateUrl = updateUrl.replace(
  "SEARCH_TEXT",
  encodeURIComponent(JSON.stringify(searchText))
);
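
To see the full round trip, here is a small self-contained sketch. The searchText object and the template URL are made-up values for illustration, and reading the parameter back with the standard URL API is an assumption about the receiving side.

```javascript
// Hypothetical inputs: a value containing characters that would otherwise break the query.
const searchText = { q: "a&b=c", tags: ["x y", "caf\u00e9"] };
const template = "http://www.example.org/page.php?name=SEARCH_TEXT";

const updateUrl = template.replace(
  "SEARCH_TEXT",
  encodeURIComponent(JSON.stringify(searchText))
);

// URLSearchParams percent-decodes the value when it is read back.
const rawValue = new URL(updateUrl).searchParams.get("name");
const roundTripped = JSON.parse(rawValue);

console.log(updateUrl);
console.log(roundTripped.q); // a&b=c
```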