Percent-encoding spaces in Java
I am using the URLUTF8Encoder.java class from W3C (www.w3.org/International/URLUTF8Encoder.java).
Currently, it will encode any blank spaces ' ' into plus signs '+'.
I am having difficulty modifying the code to percent-encode the blank space into '%20'. Unfortunately, I am not too familiar with hex. Can anyone help me out? I need to modify this snippet...
else if (ch == ' ') { // space
sbuf.append('+');
in the following code:
final static String[] hex = { "%00", "%01", "%02", "%03", "%04", "%05",
"%06", "%07", "%08", "%09", "%0A", "%0B", "%0C", "%0D", "%0E",
"%0F", "%10", "%11", "%12", "%13", "%14", "%15", "%16", "%17",
"%18", "%19", "%1A", "%1B", "%1C", "%1D", "%1E", "%1F", "%20",
"%21", "%22", "%23", "%24", "%25", "%26", "%27", "%28", "%29",
"%2A", "%2B", "%2C", "%2D", "%2E", "%2F", "%30", "%31", "%32",
"%33", "%34", "%35", "%36", "%37", "%38", "%39", "%3A", "%3B",
"%3C", "%3D", "%3E", "%3F", "%40", "%41", "%42", "%43", "%44",
"%45", "%46", "%47", "%48", "%49", "%4A", "%4B", "%4C", "%4D",
"%4E", "%4F", "%50", "%51", "%52", "%53", "%54", "%55", "%56",
"%57", "%58", "%59", "%5A", "%5B", "%5C", "%5D", "%5E", "%5F",
"%60", "%61", "%62", "%63", "%64", "%65", "%66", "%67", "%68",
"%69", "%6A", "%6B", "%6C", "%6D", "%6E", "%6F", "%70", "%71",
"%72", "%73", "%74", "%75", "%76", "%77", "%78", "%79", "%7A",
"%7B", "%7C", "%7D", "%7E", "%7F", "%80", "%81", "%82", "%83",
"%84", "%85", "%86", "%87", "%88", "%89", "%8A", "%8B", "%8C",
"%8D", "%8E", "%8F", "%90", "%91", "%92", "%93", "%94", "%95",
"%96", "%97", "%98", "%99", "%9A", "%9B", "%9C", "%9D", "%9E",
"%9F", "%A0", "%A1", "%A2", "%A3", "%A4", "%A5", "%A6", "%A7",
"%A8", "%A9", "%AA", "%AB", "%AC", "%AD", "%AE", "%AF", "%B0",
"%B1", "%B2", "%B3", "%B4", "%B5", "%B6", "%B7", "%B8", "%B9",
"%BA", "%BB", "%BC", "%BD", "%BE", "%BF", "%C0", "%C1", "%C2",
"%C3", "%C4", "%C5", "%C6", "%C7", "%C8", "%C9", "%CA", "%CB",
"%CC", "%CD", "%CE", "%CF", "%D0", "%D1", "%D2", "%D3", "%D4",
"%D5", "%D6", "%D7", "%D8", "%D9", "%DA", "%DB", "%DC", "%DD",
"%DE", "%DF", "%E0", "%E1", "%E2", "%E3", "%E4", "%E5", "%E6",
"%E7", "%E8", "%E9", "%EA", "%EB", "%EC", "%ED", "%EE", "%EF",
"%F0", "%F1", "%F2", "%F3", "%F4", "%F5", "%F6", "%F7", "%F8",
"%F9", "%FA", "%FB", "%FC", "%FD", "%FE", "%FF" };
public static String encode(String s) {
    StringBuffer sbuf = new StringBuffer();
    int len = s.length();
    for (int i = 0; i < len; i++) {
        int ch = s.charAt(i);
        if ('A' <= ch && ch <= 'Z') { // 'A'..'Z'
            sbuf.append((char) ch);
        } else if ('a' <= ch && ch <= 'z') { // 'a'..'z'
            sbuf.append((char) ch);
        } else if ('0' <= ch && ch <= '9') { // '0'..'9'
            sbuf.append((char) ch);
        } else if (ch == ' ') { // space
            sbuf.append('+');
        } else if (ch == '-'
                || ch == '_' // unreserved
                || ch == '.' || ch == '!' || ch == '~' || ch == '*'
                || ch == '\'' || ch == '(' || ch == ')') {
            sbuf.append((char) ch);
        } else if (ch <= 0x007f) { // other ASCII
            sbuf.append(hex[ch]);
        } else if (ch <= 0x07FF) { // non-ASCII <= 0x7FF
            sbuf.append(hex[0xc0 | (ch >> 6)]);
            sbuf.append(hex[0x80 | (ch & 0x3F)]);
        } else { // 0x7FF < ch <= 0xFFFF
            sbuf.append(hex[0xe0 | (ch >> 12)]);
            sbuf.append(hex[0x80 | ((ch >> 6) & 0x3F)]);
            sbuf.append(hex[0x80 | (ch & 0x3F)]);
        }
    }
    return sbuf.toString();
}
Thanks!
Comments (6)
You might want to check out the Apache Commons Codec package, which is probably a lot more robust: http://commons.apache.org/codec/. The class you're using is about 14 years old and only produces one kind of encoding (application/x-www-form-urlencoded), which REQUIRES spaces to be encoded as '+'. If you're trying to do standard URL percent-encoding (which wants spaces as %20), you'll need to use a different package entirely.
Why are you using this class instead of the API method?
java.net.URLEncoder.encode("your string", "utf-8");
And why is it a problem that spaces are encoded as + characters? That is exactly how URL-safe character encoding is supposed to work.
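If %20 really is required, one common workaround is to keep using java.net.URLEncoder and post-process its output. The sketch below assumes UTF-8 input; the class name FormToPercent and the helper encodeWithPercent20 are made up for illustration and are not part of any library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormToPercent {

    // Hypothetical helper: form-encode with URLEncoder, then turn '+' into "%20".
    static String encodeWithPercent20(String s) {
        try {
            // URLEncoder emits a literal '+' from the input as "%2B",
            // so every '+' left in its output stands for a space.
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithPercent20("a b+c")); // prints a%20b%2Bc
    }
}
```

The replacement is safe only because it runs on the encoder's output, never on the raw input.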
I won't ask why you're doing this, and will just answer your question directly. Please read the other answers to decide whether you really want to modify this code. If you just remove the branch that handles the space:
else if (ch == ' ') { // space
sbuf.append('+');
it will do what you want, because the space character will then be taken care of by the "other ASCII" branch:
} else if (ch <= 0x007f) { // other ASCII
sbuf.append(hex[ch]);
since hex[0x20] is "%20".
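As a rough, self-contained sketch of that change: the class below follows the logic of the encode method from the question with the space branch dropped. The class name SpaceAsPercent20 is invented for this example, and the lookup table is generated in a loop instead of being typed out, which is an assumption of convenience rather than how the W3C class is written.

```java
public class SpaceAsPercent20 {

    // Same contents as the hex table in the question ("%00" .. "%FF"),
    // generated in a loop instead of listed by hand.
    private static final String[] HEX = new String[256];
    static {
        for (int i = 0; i < 256; i++) {
            HEX[i] = String.format("%%%02X", i);
        }
    }

    public static String encode(String s) {
        StringBuilder sbuf = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            int ch = s.charAt(i);
            if (('A' <= ch && ch <= 'Z') || ('a' <= ch && ch <= 'z')
                    || ('0' <= ch && ch <= '9')) {
                sbuf.append((char) ch);            // letters and digits pass through
            } else if (ch == '-' || ch == '_' || ch == '.' || ch == '!'
                    || ch == '~' || ch == '*' || ch == '\''
                    || ch == '(' || ch == ')') {
                sbuf.append((char) ch);            // unreserved punctuation
            } else if (ch <= 0x007F) {
                sbuf.append(HEX[ch]);              // other ASCII, now including ' ' -> "%20"
            } else if (ch <= 0x07FF) {             // two-byte UTF-8 sequence
                sbuf.append(HEX[0xC0 | (ch >> 6)]);
                sbuf.append(HEX[0x80 | (ch & 0x3F)]);
            } else {                               // three-byte UTF-8 sequence
                sbuf.append(HEX[0xE0 | (ch >> 12)]);
                sbuf.append(HEX[0x80 | ((ch >> 6) & 0x3F)]);
                sbuf.append(HEX[0x80 | (ch & 0x3F)]);
            }
        }
        return sbuf.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("hello world")); // prints hello%20world
    }
}
```

Like the code in the question, this works one char at a time, so characters outside the Basic Multilingual Plane (surrogate pairs) are not combined into a single four-byte UTF-8 sequence.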
Just do this:
You can use the built-in
java.net.URI
class, which is normally used via its static factory method, as in URI.create("http://example.com/search?param=42"),
but when a parameter contains a literal space you can use its multi-argument constructor instead, which percent-encodes the space as %20. The same constructor can be used to build only the query part of the URL, which you can then append to an existing string.
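The original snippets for this answer are not available, so the following is only a sketch of what such calls might look like, using the five-argument java.net.URI constructor (scheme, authority, path, query, fragment). The class UriQueryDemo and its two helper methods are hypothetical names chosen for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriQueryDemo {

    // Builds a full URL; the constructor quotes characters that are illegal in a query.
    static String fullUrl(String query) {
        try {
            return new URI("http", "example.com", "/search", query, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Builds only the "?query" part, which can be appended to an existing string.
    static String queryOnly(String query) {
        try {
            return new URI(null, null, null, query, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://example.com/search?param=hello%20world
        System.out.println(fullUrl("param=hello world"));
        // prints ?param=hello%20world
        System.out.println(queryOnly("param=hello world"));
    }
}
```

Note that this constructor only quotes characters that are illegal in a query. Characters such as '&', '=' and '+' are legal there and are left untouched, so it is not a substitute for encoding individual parameter values that may contain them.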
It's working correctly; it should work with + as well as it would with %20.
Maybe try
java.net.URLEncoder.encode("url", "UTF-8")
?