Are square brackets in URLs allowed?
I noticed that Apache Commons HttpClient (3.0.1) throws an IOException, whereas wget and Firefox accept square brackets.
URL example:
http://example.com/path/to/file[3].html
My HTTP client encounters such URLs, but I'm not sure whether to patch the code or to throw an exception (which is arguably what it should do).
Square brackets [ and ] in URLs are often not supported. Replace them with %5B and %5D.
On the command line, the replacement can be done with bash and sed.
In Java, use URLEncoder.encode(String s, String enc) (apply it to individual components rather than to a whole URL, since it also encodes : and /).
In PHP, use rawurlencode() or urlencode().
Using your favorite programming language... Please extend this answer by posting a comment or by editing it directly to add the function you use in your programming language ;-)
For more details, see RFC 3986, which specifies the URL syntax. Appendix A collects the grammar, including %-encoding; square brackets belong to "gen-delims" and therefore have to be %-encoded.
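As one more language for the list above, here is a minimal Python sketch of the same replacement. The helper name encode_brackets is hypothetical, and the sketch assumes that only the path, query, and fragment should be touched, so that a bracketed IPv6 host stays intact.

```python
from urllib.parse import urlsplit, urlunsplit


def encode_brackets(url: str) -> str:
    """Replace [ and ] with %5B and %5D everywhere except the host part."""
    parts = urlsplit(url)

    def fix(component: str) -> str:
        # Only the two bracket characters are rewritten; everything else is kept.
        return component.replace("[", "%5B").replace("]", "%5D")

    return urlunsplit(
        (parts.scheme, parts.netloc, fix(parts.path), fix(parts.query), fix(parts.fragment))
    )


print(encode_brackets("http://example.com/path/to/file[3].html"))
```

A plain string replacement is used here instead of urllib.parse.quote so that sequences that are already percent-encoded are not encoded a second time.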
Pretty much the only characters not allowed in pathnames are # and ? as they signify the end of the path.
The URI RFC has the definitive answer:
http://www.ietf.org/rfc/rfc1738.txt
The answer is that they should be hex-encoded, but given Postel's law, most things will accept them verbatim.
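To illustrate how ? and # end the path, and how a lenient parser accepts brackets verbatim, here is a small Python sketch. The function name split_url is hypothetical; urlsplit is the standard-library parser, and other clients may well be stricter.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple:
    """Return (path, query, fragment) as the standard-library parser sees them."""
    parts = urlsplit(url)
    return parts.path, parts.query, parts.fragment


# The path stops at the first ? and the query stops at the first #.
# The unencoded brackets in the path are passed through unchanged.
print(split_url("http://example.com/path/to/file[3].html?page=2#top"))
```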
I know this question is a bit old, but I just wanted to note that PHP uses brackets to pass arrays in a URL.
With a query string such as ?bar[]=1&bar[]=2&bar[]=3, $_GET['bar'] will contain array(1, 2, 3).
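For clients that have to produce or consume PHP-style array parameters, the following Python sketch may help. The helper names are hypothetical. Note that Python's parse_qs gives brackets no special meaning, so the key is literally bar[], and that urlencode emits the brackets in their percent-encoded form.

```python
from urllib.parse import parse_qs, urlencode


def build_php_array_query(name: str, values) -> str:
    """Build a query string in the name[]=value form that PHP reads as an array."""
    return urlencode([(name + "[]", value) for value in values])


def php_style_values(query: str, name: str) -> list:
    """Collect all values sent under name[]; works for encoded and raw brackets."""
    return parse_qs(query).get(name + "[]", [])


encoded = build_php_array_query("bar", [1, 2, 3])
print(encoded)
print(php_style_values(encoded, "bar"))
print(php_style_values("bar[]=1&bar[]=2&bar[]=3", "bar"))
```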
Stack Overflow does not seem to encode them:
https://stackoverflow.com/search?q=square+brackets+[url]
For using the HttpClient commons class, you want to look into the org.apache.commons.httpclient.util.URIUtil class, specifically the encode() method. Use it to URI-encode the URL before trying to fetch it.
Any browser or web-enabled software that accepts URLs and does not throw an exception when special characters are introduced is almost guaranteed to be encoding the special characters behind the scenes. Curly brackets, square brackets, spaces, etc. all have special encoded representations so as not to produce conflicts. As per the previous answers, the safest way to deal with these is to URL-encode them before handing them off to something that will try to resolve the URL.
It is best to URL-encode them, as they are clearly not supported by all web servers. Sometimes, even when there is a standard, not everyone follows it.
According to the URL specification, the square brackets are not valid URL characters.
Here are the relevant snippets:
Square brackets are considered unsafe, but the majority of browsers will parse them correctly. Having said that, it is better to replace square brackets with some other characters.
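RFC 3986 (section 3.2.2) states that the only place where square brackets are allowed in the URI syntax is around an IP literal in the host part.
So in theory you should not see such URIs in the wild, as they should arrive encoded.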
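A client that wants to decide between patching and rejecting could first check whether a URL carries brackets anywhere other than the host. This is only a sketch in Python: the function name has_brackets_outside_host is hypothetical, and it relies on the standard-library urlsplit to separate the host from the rest.

```python
from urllib.parse import urlsplit


def has_brackets_outside_host(url: str) -> bool:
    """True if [ or ] appears in the path, query, or fragment of the URL."""
    parts = urlsplit(url)
    rest = parts.path + parts.query + parts.fragment
    return "[" in rest or "]" in rest


# Brackets around an IPv6 literal in the host are the permitted case.
print(has_brackets_outside_host("http://[2001:db8::1]:8080/index.html"))
# Brackets in the path are the case from the question.
print(has_brackets_outside_host("http://example.com/path/to/file[3].html"))
```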