Java URI changes between JDK 1.4 and JDK 1.5

Posted 2024-11-26 14:10:41

import java.net.URI;
import java.net.URISyntaxException;

public class TestURI {
    public static void main(String[] args) throws URISyntaxException {
        String first = "foo";                        // scheme
        String second = "bar";                       // scheme-specific part
        String third = "[space or another space]";   // fragment

        // Three-argument constructor: URI(scheme, ssp, fragment)
        URI temp = new URI(first, second, third);
        System.out.println(temp.getFragment());
    }
}

When I run the above code in JDK 1.4, I get

[space or another space]

When I run the same code in JDK 1.5/1.6, I get the following:

[space%20or%20another%20space]

Could somebody tell me what changed?

Thanks,
Raj

Edit:

If I do something like the following, it works:

import java.net.URI;
import java.net.URISyntaxException;

public class TestURI {
    public static void main(String[] args) throws URISyntaxException {
        String first = "foo";
        String second = "bar";
        // Swap the square brackets for placeholder tokens before building the URI
        String third = "[space or another space]"
                .replace("[", "leftSB")
                .replace("]", "rightSB");

        URI temp = new URI(first, second, third);
        // Restore the brackets after the fragment has been decoded
        System.out.println(temp.getFragment()
                .replace("leftSB", "[")
                .replace("rightSB", "]"));
    }
}

Comments (1)

愿得七秒忆 2024-12-03 14:10:41

It looks like the spaces got URI-encoded.

%20 is the percent-encoded form of the ASCII space character: a percent sign followed by the character's hexadecimal code, 0x20.
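
To make the %20 concrete, here is a minimal hand-rolled sketch of percent-encoding and decoding over UTF-8 bytes. The class name EscapeSketch and its two helper methods are made up for illustration; this is not how java.net.URI is implemented, and the decoder assumes well-formed input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class EscapeSketch {
    // Encode: replace every UTF-8 byte of the input with an escaped octet (%XX).
    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decode: turn each escaped octet back into a byte, then read the bytes as UTF-8.
    // Assumes every '%' is followed by two hex digits and all other chars are ASCII.
    static String decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.write(c); // plain ASCII passes through unchanged
                i++;
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode(" "));       // %20
        System.out.println(encode("\u00e9"));  // %C3%A9 (two UTF-8 bytes)
        System.out.println(decode("[space%20or%20another%20space]")); // [space or another space]
    }
}
```

The space is a single byte with value 0x20, hence one escaped octet; a non-ASCII character such as U+00E9 takes two.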

I suppose spaces are illegal in the fragment identifier, and the Java 1.4 implementation did not take that into account.

From the class documentation (emphasis mine):

RFC 2396 allows escaped octets to appear in the user-info, path, query, and fragment components.
Escaping serves two purposes in URIs:

  • To encode non-US-ASCII characters when a URI is required to conform strictly to RFC 2396
    by not containing any other characters.

  • To quote characters that are otherwise illegal in a component. The user-info, path, query,
    and fragment components differ slightly in terms of which characters are considered legal
    and illegal.

These purposes are served in this class by three related operations:

  • A character is encoded by replacing it with the sequence of escaped octets that
    represent that character in the UTF-8 character set. [...]
  • An illegal character is quoted simply by encoding it. The space character, for example, is
    quoted by replacing it with "%20". [...]
  • A sequence of escaped octets is decoded by replacing it with the sequence of characters that it
    represents in the UTF-8 character set. [...]

These operations are exposed in the constructors and methods of this class as follows:

  • The single-argument constructor [...]

  • The multi-argument constructors quote illegal characters as required by the components
    in which they appear. The percent character ('%') is always quoted by these constructors.
    Any other characters are preserved.

  • ...
  • The getUserInfo, getPath, getQuery, getFragment, getAuthority, and getSchemeSpecificPart
    methods decode any escaped octets in their corresponding components. The strings returned by these
    methods may contain both other characters and illegal characters, and will not contain any escaped
    octets.
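
The quoted rules can be tried out with a fragment that has no square brackets, so the bracket issue from the question does not come into play. This is only a small sketch; the class name RawVersusDecoded and the sample fragment are my own. The constructor should quote both the space and the percent sign, getRawFragment should return the quoted form, and getFragment the decoded one.

```java
import java.net.URI;
import java.net.URISyntaxException;

class RawVersusDecoded {
    public static void main(String[] args) throws URISyntaxException {
        // Multi-argument constructor: illegal characters (the spaces) are quoted,
        // and '%' is always quoted as %25.
        URI uri = new URI("foo", "bar", "50% of the space");

        System.out.println(uri);                  // foo:bar#50%25%20of%20the%20space
        System.out.println(uri.getRawFragment()); // 50%25%20of%20the%20space
        System.out.println(uri.getFragment());    // 50% of the space
    }
}
```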

You are using the three-argument constructor and then the getFragment method. According to the documentation above, getFragment should decode the spaces again, but it does not. This could be a bug, but the Sun Bug database seems to be offline now, so I can't really check this.
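
One way to narrow this down is to print the raw and the decoded fragment side by side for the exact input from the question, together with the runtime version. A rough sketch (the class name FragmentCheck is made up): if getRawFragment shows %20 while getFragment also shows %20, the quoting by the constructor is as documented and the decoding step is what behaves differently. The Edit in the question suggests the square brackets play a role, but that is an inference from the workaround, not something I have confirmed.

```java
import java.net.URI;
import java.net.URISyntaxException;

class FragmentCheck {
    public static void main(String[] args) throws URISyntaxException {
        URI uri = new URI("foo", "bar", "[space or another space]");

        // The question reports different results on 1.4 and on 1.5/1.6.
        System.out.println("java.version   = " + System.getProperty("java.version"));

        // Still-quoted form, as produced by the constructor.
        System.out.println("getRawFragment = " + uri.getRawFragment());

        // Decoded form. No expected output is given here, because the post
        // itself shows this value differing between JDK releases.
        System.out.println("getFragment    = " + uri.getFragment());
    }
}
```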
