Java URI class: does the constructor determine whether the query is encoded?

Posted on 2024-11-04 01:22:40


Is this behavior intentional?

//create the same URI using two different constructors

URI foo = null, bar = null;
try { 
    //constructor: URI(uri string)
    foo = new URI("http://localhost/index.php?token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB");
} catch (URISyntaxException e) {} 
try { 
    //constructor: URI(scheme, authority, path, query, fragment) 
    bar = new URI("http", "localhost", "/index.php", "token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB", null);
} catch (URISyntaxException e) {}

//the output:
//foo.getQuery() = token=4/4EzdsSBg_4vX6D5pzvdsMLDoyItB
//bar.getQuery() = token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB

The URI(String) constructor seems to be decoding the query part of the URI. I thought the query portion was supposed to stay encoded? And why doesn't the other constructor decode the query part?


Comments (2)

末蓝 2024-11-11 01:22:40


From the URI JavaDoc:

The single-argument constructor requires any illegal characters in its argument to be quoted and preserves any escaped octets and other characters that are present.

The multi-argument constructors quote illegal characters as required by the components in which they appear. The percent character ('%') is always quoted by these constructors. Any other characters are preserved.

Thus URI(String) expects you to encode everything correctly and assumes %2F is such an encoded octet, which will be decoded to /.

The other constructors would encode the % character (resulting in %252F for input %2F), and thus after decoding you still get %2F.
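To make the difference visible, it may help to print the raw (still escaped) query next to the decoded one for both constructors. The following is only a sketch: the class name RawVsDecodedQuery and its helper methods are made up for illustration, and it uses URI.create(String), which is documented to behave like the URI(String) constructor but wraps the checked exception.

```java
import java.net.URI;
import java.net.URISyntaxException;

class RawVsDecodedQuery {
    // Same query text as in the question.
    static final String QUERY = "token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB";

    // Single-argument form: the input is taken as already escaped.
    static URI fromString() {
        return URI.create("http://localhost/index.php?" + QUERY);
    }

    // Multi-argument form: every '%' in a component is quoted as %25.
    static URI fromComponents() {
        try {
            return new URI("http", "localhost", "/index.php", QUERY, null);
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URI foo = fromString();
        URI bar = fromComponents();

        // getRawQuery() returns the escaped form stored in the URI.
        System.out.println(foo.getRawQuery()); // token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB
        System.out.println(bar.getRawQuery()); // token=4%252F4EzdsSBg_4vX6D5pzvdsMLDoyItB

        // getQuery() decodes one level of escaping.
        System.out.println(foo.getQuery());    // token=4/4EzdsSBg_4vX6D5pzvdsMLDoyItB
        System.out.println(bar.getQuery());    // token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB
    }
}
```

Seen this way, both objects decode exactly once in getQuery(); they differ only in what was stored at construction time.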

I assume the purpose of the deviation between the constructors is to allow things like new URI(otherUri.toString()) with toString() returning a fully encoded URI.
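A small round-trip check along those lines, as a sketch only (the class UriRoundTrip and its helpers build and reparse are hypothetical names, not part of the original post):

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriRoundTrip {
    // Build a URI from unescaped components; '%' is quoted as %25.
    static URI build(String query) {
        try {
            return new URI("http", "localhost", "/index.php", query, null);
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // Re-parse the fully escaped string form with the single-argument parser.
    static URI reparse(URI u) {
        return URI.create(u.toString());
    }

    public static void main(String[] args) {
        URI bar = build("token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB");
        URI copy = reparse(bar);

        System.out.println(bar);                   // http://localhost/index.php?token=4%252F4EzdsSBg_4vX6D5pzvdsMLDoyItB
        System.out.println(copy.equals(bar));      // true
        System.out.println(copy.getQuery());       // token=4%2F4EzdsSBg_4vX6D5pzvdsMLDoyItB
    }
}
```

Because toString() yields the escaped form and the single-argument parser preserves existing escapes, the copy carries the same query as the original.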

筱武穆 2024-11-11 01:22:40


A quick analysis:

foo

The constructor parses the input URI and unquotes the literal %2F to /. This is what we expect.

bar

With the constructor used in the bar example, the query part is taken as a raw String with illegal chars and encoded first, with the effect that %2F is translated to %252F. Then it is parsed and the now unquoted query part is (again) %2F.

Lesson learned: With the first constructor we pass an RFC 2396 compliant URI. The other constructors take normal Strings (unquoted illegal chars) and URI constructs an RFC 2396 compliant representation.
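The same lesson can be illustrated with a character that is plainly illegal in a URI, such as a space. This is a rough sketch; the class PlainStringComponents, its helpers, and the sample query "q=hello world" are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PlainStringComponents {
    // Multi-argument constructor: accepts a plain string and quotes illegal chars.
    static URI withQuery(String plainQuery) {
        try {
            return new URI("http", "localhost", "/index.php", plainQuery, null);
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // Single-argument constructor: the input must already be a valid, escaped URI.
    static boolean parsesAsUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        URI u = withQuery("q=hello world");
        System.out.println(u.getRawQuery()); // q=hello%20world
        System.out.println(u.getQuery());    // q=hello world

        // The unescaped space is rejected by the single-argument constructor.
        System.out.println(parsesAsUri("http://localhost/index.php?q=hello world"));   // false
        System.out.println(parsesAsUri("http://localhost/index.php?q=hello%20world")); // true
    }
}
```

So the multi-argument constructors are the ones to reach for when the components are ordinary, unescaped text, and the single-argument constructor when the string is already in escaped form.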

Here's a working example on IDEONE (with extra supporting output)
