Java URI class: constructor determines whether or not the query is encoded?
Is this behavior intentional?
    // create the same URI using two different constructors
    URI foo = null, bar = null;
    try {
        // constructor: URI(String str)
        foo = new URI("http://localhost/index.php?token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB");
    } catch (URISyntaxException e) {}
    try {
        // constructor: URI(scheme, authority, path, query, fragment)
        bar = new URI("http", "localhost", "/index.php", "token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB", null);
    } catch (URISyntaxException e) {}
    // the output:
    // foo.getQuery() = token=4/4EzdsSBg_4vX6D5pzvdsMLDoyItB
    // bar.getQuery() = token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB
The URI(String) constructor seems to be decoding the query part of the URI. I thought the query part was supposed to stay encoded? And why doesn't the other constructor decode the query part?
According to the URI JavaDoc, the single-argument constructor requires any illegal characters in its argument to be quoted already and preserves escaped octets, while the multi-argument constructors always quote the percent character ('%').

Thus URI(String) expects you to encode everything correctly and assumes %2F is such an encoded octet, which is decoded to / when you call getQuery(). The other constructors encode the % character (resulting in %252F for input %2F), and thus after decoding you still get %2F.

I assume the purpose of the deviation between the constructors is to allow things like new URI(otherUri.toString()), with toString() returning a fully encoded URI.
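The difference is easier to see when the raw and the decoded query are printed side by side. The following is a minimal sketch using only java.net.URI; the class name RawVsDecodedQuery and the helper methods singleArg() and multiArg() are made up for illustration and are not part of the original question.

```java
import java.net.URI;
import java.net.URISyntaxException;

class RawVsDecodedQuery {
    // The query string from the question, containing the escaped octet %2F.
    static final String QUERY = "token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB";

    // Hypothetical helper: single-argument constructor, input is taken as already encoded.
    static URI singleArg() {
        try {
            return new URI("http://localhost/index.php?" + QUERY);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: multi-argument constructor, '%' in the query is quoted as %25.
    static URI multiArg() {
        try {
            return new URI("http", "localhost", "/index.php", QUERY, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI foo = singleArg();
        URI bar = multiArg();

        System.out.println(foo.getRawQuery()); // token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB
        System.out.println(foo.getQuery());    // token=4/4EzdsSBg_4vX6D5pzvdsMLDoyItB
        System.out.println(bar.getRawQuery()); // token=4%252F4EzdsSBg_4vX6D5pzvdsMLDoyItB
        System.out.println(bar.getQuery());    // token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB
    }
}
```

In both cases getQuery() performs exactly one round of decoding; the two URIs differ in what was stored, which getRawQuery() exposes.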
A quick analysis:

foo
The constructor parses the input URI, and getQuery() unquotes the literal %2F to /. This is what we expect.

bar
With the constructor used in the bar example, the query part is taken as a raw String with illegal chars and encoded first, with the effect that %2F is translated to %252F. Then it is parsed, and the now unquoted query part is (again) %2F.

Lesson learned: With the first constructor we pass an RFC 2396 compliant URI. The other constructors take normal Strings (with unquoted illegal chars), and URI constructs an RFC 2396 compliant representation.

Here's a working example on IDEONE (with extra supporting output)
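Following that lesson, one way to use the multi-argument constructor is to hand it the unencoded query and let URI do the quoting. The sketch below assumes that is acceptable for your input; the class name UriRoundTrip, the helpers build() and reparse(), and the query "q=50% off" are invented for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriRoundTrip {
    // Hypothetical helper: builds the URI from an UNENCODED query string.
    static URI build(String unencodedQuery) {
        try {
            return new URI("http", "localhost", "/index.php", unencodedQuery, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: round-trips a URI through its fully encoded string form.
    static URI reparse(URI uri) {
        try {
            return new URI(uri.toString());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // '/' is legal in a query, so it is stored as-is and getQuery() matches foo's.
        URI token = build("token=4/4EzdsSBg_4vX6D5pzvdsMLDoyItB");
        System.out.println(token.getQuery());    // token=4/4EzdsSBg_4vX6D5pzvdsMLDoyItB

        // '%' and ' ' are quoted by the multi-argument constructor.
        URI built = build("q=50% off");
        System.out.println(built.getRawQuery()); // q=50%25%20off
        System.out.println(built.getQuery());    // q=50% off

        // toString() is fully encoded, so the single-argument constructor accepts it.
        System.out.println(reparse(built).equals(built)); // true
    }
}
```

Note that the token URI built this way stores a literal / rather than %2F, so its raw form differs from foo's even though getQuery() returns the same text. If the server distinguishes the two, pass the pre-encoded string to URI(String) instead.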