Percent-encoding spaces in Java

Posted on 2024-08-26 21:55:47

I am using the URLUTF8Encoder.java class from W3C (www.w3.org/International/URLUTF8Encoder.java).

Currently, it will encode any blank spaces ' ' into plus signs '+'.

I am having difficulty modifying the code to percent-encode the blank space into '%20'. Unfortunately, I am not too familiar with hex. Can anyone help me out? I need to modify this snippet...

else if (ch == ' ') { // space
                sbuf.append('+');

in the following code:

final static String[] hex = { "%00", "%01", "%02", "%03", "%04", "%05",
            "%06", "%07", "%08", "%09", "%0A", "%0B", "%0C", "%0D", "%0E",
            "%0F", "%10", "%11", "%12", "%13", "%14", "%15", "%16", "%17",
            "%18", "%19", "%1A", "%1B", "%1C", "%1D", "%1E", "%1F", "%20",
            "%21", "%22", "%23", "%24", "%25", "%26", "%27", "%28", "%29",
            "%2A", "%2B", "%2C", "%2D", "%2E", "%2F", "%30", "%31", "%32",
            "%33", "%34", "%35", "%36", "%37", "%38", "%39", "%3A", "%3B",
            "%3C", "%3D", "%3E", "%3F", "%40", "%41", "%42", "%43", "%44",
            "%45", "%46", "%47", "%48", "%49", "%4A", "%4B", "%4C", "%4D",
            "%4E", "%4F", "%50", "%51", "%52", "%53", "%54", "%55", "%56",
            "%57", "%58", "%59", "%5A", "%5B", "%5C", "%5D", "%5E", "%5F",
            "%60", "%61", "%62", "%63", "%64", "%65", "%66", "%67", "%68",
            "%69", "%6A", "%6B", "%6C", "%6D", "%6E", "%6F", "%70", "%71",
            "%72", "%73", "%74", "%75", "%76", "%77", "%78", "%79", "%7A",
            "%7B", "%7C", "%7D", "%7E", "%7F", "%80", "%81", "%82", "%83",
            "%84", "%85", "%86", "%87", "%88", "%89", "%8A", "%8B", "%8C",
            "%8D", "%8E", "%8F", "%90", "%91", "%92", "%93", "%94", "%95",
            "%96", "%97", "%98", "%99", "%9A", "%9B", "%9C", "%9D", "%9E",
            "%9F", "%A0", "%A1", "%A2", "%A3", "%A4", "%A5", "%A6", "%A7",
            "%A8", "%A9", "%AA", "%AB", "%AC", "%AD", "%AE", "%AF", "%B0",
            "%B1", "%B2", "%B3", "%B4", "%B5", "%B6", "%B7", "%B8", "%B9",
            "%BA", "%BB", "%BC", "%BD", "%BE", "%BF", "%C0", "%C1", "%C2",
            "%C3", "%C4", "%C5", "%C6", "%C7", "%C8", "%C9", "%CA", "%CB",
            "%CC", "%CD", "%CE", "%CF", "%D0", "%D1", "%D2", "%D3", "%D4",
            "%D5", "%D6", "%D7", "%D8", "%D9", "%DA", "%DB", "%DC", "%DD",
            "%DE", "%DF", "%E0", "%E1", "%E2", "%E3", "%E4", "%E5", "%E6",
            "%E7", "%E8", "%E9", "%EA", "%EB", "%EC", "%ED", "%EE", "%EF",
            "%F0", "%F1", "%F2", "%F3", "%F4", "%F5", "%F6", "%F7", "%F8",
            "%F9", "%FA", "%FB", "%FC", "%FD", "%FE", "%FF" };

public static String encode(String s) {
        StringBuffer sbuf = new StringBuffer();
        int len = s.length();
        for (int i = 0; i < len; i++) {
            int ch = s.charAt(i);
            if ('A' <= ch && ch <= 'Z') { // 'A'..'Z'
                sbuf.append((char) ch);
            } else if ('a' <= ch && ch <= 'z') { // 'a'..'z'
                sbuf.append((char) ch);
            } else if ('0' <= ch && ch <= '9') { // '0'..'9'
                sbuf.append((char) ch);
            } else if (ch == ' ') { // space
                sbuf.append('+');
            } else if (ch == '-'
                    || ch == '_' // unreserved
                    || ch == '.' || ch == '!' || ch == '~' || ch == '*'
                    || ch == '\'' || ch == '(' || ch == ')') {
                sbuf.append((char) ch);
            } else if (ch <= 0x007f) { // other ASCII
                sbuf.append(hex[ch]);
            } else if (ch <= 0x07FF) { // non-ASCII <= 0x7FF
                sbuf.append(hex[0xc0 | (ch >> 6)]);
                sbuf.append(hex[0x80 | (ch & 0x3F)]);
            } else { // 0x7FF < ch <= 0xFFFF
                sbuf.append(hex[0xe0 | (ch >> 12)]);
                sbuf.append(hex[0x80 | ((ch >> 6) & 0x3F)]);
                sbuf.append(hex[0x80 | (ch & 0x3F)]);
            }
        }
        return sbuf.toString();
    }

Thanks!

茶花眉 2024-09-02 21:55:47

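You may want to look at the Apache Commons Codec package, which is probably more robust: http://commons.apache.org/codec/. The class you are using is about 14 years old and only produces one kind of encoding (application/x-www-form-urlencoded), which requires spaces to be encoded as "+". If you are trying to do standard URL encoding (which requires spaces to be encoded as %20), you will need to use a different package entirely.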

咿呀咿呀哟 2024-09-02 21:55:47

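Why use this class instead of the API method?

java.net.URLEncoder.encode("your string", "utf-8");

And why is it a problem that spaces are encoded as + characters? That is exactly how URL-safe character encoding works.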
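
To make the behaviour of the API method concrete, here is a minimal sketch of what java.net.URLEncoder produces for a string containing spaces. The class name FormEncodingDemo and the helper formEncode are made up for illustration; only URLEncoder.encode comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: shows the form-style encoding that URLEncoder applies.
final class FormEncodingDemo {

    // Wraps the checked exception; UTF-8 is a charset every JVM must support.
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("four and two")); // four+and+two
        System.out.println(formEncode("1+1=2"));        // 1%2B1%3D2
    }
}
```

A space becomes "+", while a literal plus sign becomes "%2B", so the two remain distinguishable after encoding.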

半山落雨半山空 2024-09-02 21:55:47

I won't ask why you're doing this, and just answer your question directly. Please read other answers to determine if you really want to be modifying this code. If you just remove the code:

else if (ch == ' ') { // space
   sbuf.append('+');
}

It will do what you want, because the space character will be taken care of by the code:

else if (ch <= 0x007f) { // other ASCII
   sbuf.append(hex[ch]);
}
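
Put together, the change might look like the sketch below. It is not the W3C file itself: the class name PercentEncoder is hypothetical, the lookup table is generated in a loop rather than written out as 256 literals, and StringBuilder replaces StringBuffer. The branch structure otherwise follows the code in the question, with the space branch removed.

```java
// Hypothetical class; a condensed variant of the encoder from the question.
final class PercentEncoder {

    // HEX[i] holds "%XX" for byte value i, equivalent to the hand-written table.
    private static final String[] HEX = new String[256];
    static {
        for (int i = 0; i < 256; i++) {
            HEX[i] = String.format("%%%02X", i);
        }
    }

    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            int ch = s.charAt(i);
            if (('A' <= ch && ch <= 'Z') || ('a' <= ch && ch <= 'z')
                    || ('0' <= ch && ch <= '9')) {
                sb.append((char) ch);
            } else if ("-_.!~*'()".indexOf(ch) >= 0) { // unreserved marks
                sb.append((char) ch);
            } else if (ch <= 0x007F) { // other ASCII, now including ' ' -> "%20"
                sb.append(HEX[ch]);
            } else if (ch <= 0x07FF) { // two-byte UTF-8 sequence
                sb.append(HEX[0xC0 | (ch >> 6)]);
                sb.append(HEX[0x80 | (ch & 0x3F)]);
            } else { // three-byte UTF-8 sequence
                sb.append(HEX[0xE0 | (ch >> 12)]);
                sb.append(HEX[0x80 | ((ch >> 6) & 0x3F)]);
                sb.append(HEX[0x80 | (ch & 0x3F)]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("four and two")); // four%20and%20two
    }
}
```

Like the code in the question, this works one char at a time, so characters outside the Basic Multilingual Plane (surrogate pairs) are not encoded as proper four-byte UTF-8.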

岁吢 2024-09-02 21:55:47

Just do this:

String str = "Hello World+You";
String encodedStr = URLEncoder.encode(str, "UTF-8");
encodedStr = encodedStr.replace("+", "%20");
System.out.println("Encoded String: " + encodedStr);
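
The snippet above needs an import and has to deal with a checked exception before it compiles. A self-contained sketch of the same idea follows; the class name SpaceAsPercent20 and the helper encode are hypothetical. The replacement is safe because URLEncoder has already turned any literal "+" in the input into "%2B", so every "+" left in the output stands for a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: form-encode first, then rewrite '+' (space) as "%20".
final class SpaceAsPercent20 {

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every JVM must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("Hello World+You")); // Hello%20World%2BYou
    }
}
```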

请你别敷衍 2024-09-02 21:55:47

You can use the built-in java.net.URI class, which is normally used via its static factory method, as in URI.create("http://example.com/search?param=42"). When a parameter contains a literal space, however, you can use the multi-argument constructor instead:

URI uri = new URI("http", // scheme
    null,                 // user authentication info
    "example.com",        // domain
    -1,                   // port (use -1 for default port 80)
    "/search",            // path
    "param=four and two", // one or more parameters
    null);                // fragment (appended with the # char)
System.out.println(uri);
// OUTPUT:
// http://example.com/search?param=four%20and%20two

If you look at this particular URI constructor, you'll see that passing -1 leaves the port undefined, so the scheme's default port (80 for http) applies; explicitly passing 80 would produce a URL like http://example.com:80/search?param=four%20and%20two, which you probably do not want.

The same constructor can be used to build only the query part of the URL, which you can then append to an existing string:

URI uri2 = new URI(null, null, null, -1, null, "param=four and two", null);
System.out.println(uri2);
// OUTPUT:
// ?param=four%20and%20two

It might be worth mentioning that a URI is not the same as a URL: file:/// is a valid URI but not a valid URL.
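
One caveat with the constructor approach: it only quotes characters that are illegal in a URI, so characters that are legal in a query, such as "&", are left as they are. The sketch below illustrates this; the class name UriQueryDemo and the helper build are made up for the example, while the seven-argument java.net.URI constructor is the one used above.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo: the multi-argument URI constructor quotes illegal characters only.
final class UriQueryDemo {

    static URI build(String query) {
        try {
            // scheme, userInfo, host, port, path, query, fragment
            return new URI("http", null, "example.com", -1, "/search", query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The space is quoted as %20.
        System.out.println(build("param=four and two"));
        // prints http://example.com/search?param=four%20and%20two

        // '&' is a legal query character, so it is not quoted.
        System.out.println(build("q=a&b c").getRawQuery());
        // prints q=a&b%20c
    }
}
```

If a parameter value may itself contain "&" or "=", it has to be encoded separately before it is placed in the query string.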

鱼窥荷 2024-09-02 21:55:47

It works fine; it should work with + just as well as with %20.

Maybe try java.net.URLEncoder.encode("url", "UTF-8")?
