Decoding a UTF-8 encoded string in Android
I have a string that comes from an XML document and contains German text. The German-specific characters are encoded in UTF-8. Before displaying the string, I need to decode it.
I have tried the following:
try {
    BufferedReader in = new BufferedReader(
            new InputStreamReader(
                    new ByteArrayInputStream(nodevalue.getBytes()), "UTF8"));
    event.attributes.put("title", in.readLine());
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
I have also tried this:
try {
    event.attributes.put("title", URLDecoder.decode(nodevalue, "UTF-8"));
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
Neither of them works. How do I decode the German string?
Thank you in advance.
UPDATE:
@Override
public void characters(char[] ch, int start, int length)
        throws SAXException {
    // TODO Auto-generated method stub
    super.characters(ch, start, length);
    if (nodename != null) {
        String nodevalue = String.copyValueOf(ch, 0, length);
        if (nodename.equals("startdat")) {
            if (event.attributes.get("eventid").equals("187")) {
            }
        }
        if (nodename.equals("startscreen")) {
            imageaddress = nodevalue;
        }
        else {
            if (nodename.equals("title")) {
                // try {
                //     BufferedReader in = new BufferedReader(
                //             new InputStreamReader(
                //                     new ByteArrayInputStream(nodevalue.getBytes()), "UTF8"));
                //     event.attributes.put("title", in.readLine());
                // } catch (UnsupportedEncodingException e) {
                //     // TODO Auto-generated catch block
                //     e.printStackTrace();
                // } catch (IOException e) {
                //     // TODO Auto-generated catch block
                //     e.printStackTrace();
                // }
                // try {
                //     event.attributes.put("title",
                //             URLDecoder.decode(nodevalue, "UTF-8"));
                // } catch (UnsupportedEncodingException e) {
                //     // TODO Auto-generated catch block
                //     e.printStackTrace();
                // }
                event.attributes.put("title", StringEscapeUtils
                        .unescapeHtml(new String(ch, start, length).trim()));
            } else
                event.attributes.put(nodename, nodevalue);
        }
    }
}
Comments (1)
You could use the String constructor with the charset parameter:
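As a minimal sketch of that constructor, assuming you still have the raw bytes of the text (the class and method names below are illustrative and do not come from the question):

```java
import java.nio.charset.Charset;

class Utf8Decode {
    // Charset.forName avoids the checked UnsupportedEncodingException
    // that the String(byte[], String) overload declares.
    private static final Charset UTF_8 = Charset.forName("UTF-8");

    /** Decodes raw UTF-8 bytes into a Java String. */
    static String decodeUtf8(byte[] raw) {
        return new String(raw, UTF_8);
    }
}
```

This only helps while the data is still a byte array. On Android the default charset is UTF-8, so calling nodevalue.getBytes() on an already decoded String and decoding the result as UTF-8 again, as in the first attempt in the question, just returns the same text.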
Also, since you get the data from an XML document, which I assume is encoded in UTF-8, the problem is probably in how it is parsed.
You should use an InputStream/InputSource instead of an XMLReader implementation, because it comes with the encoding. So if you're getting this data from an HTTP response, you could either use both InputStream and InputSource or just the InputStream.
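One way this could look, as a hedged sketch rather than the answer's own code: the stream is wrapped in an InputSource whose encoding is set explicitly, and a SAX handler collects the text of the first title element. The TitleReader class and readTitle method are hypothetical names, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TitleReader {
    /** Parses the stream as UTF-8 XML and returns the text of the first title element. */
    static String readTitle(InputStream in) {
        final StringBuilder title = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inTitle;
            private boolean done;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                inTitle = !done && "title".equals(qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver the text in several chunks, so append.
                if (inTitle) title.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    inTitle = false;
                    done = true;
                }
            }
        };
        try {
            InputSource source = new InputSource(in);
            source.setEncoding("UTF-8"); // tell the parser how to decode the bytes
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return title.toString();
    }
}
```

Because the parser decodes the bytes itself, the handler already receives proper Java characters and no further decoding step is needed.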
Update 1
Here is a sample of a complete request and response handling:
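The sketch below shows one possible shape for such a round trip; it is an assumption-laden illustration, not the answer's original sample. It uses HttpURLConnection for the request and a SAX parser for the response, and EventFeedClient, fetchTitles and parseTitles are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventFeedClient {

    /** Downloads the XML at the given address and returns the text of every title element. */
    static List<String> fetchTitles(String address) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(address).openConnection();
            try (InputStream body = conn.getInputStream()) {
                // Hand the raw byte stream to the parser; do not turn it into a String first.
                return parseTitles(body);
            }
        } catch (IOException e) {
            throw new IllegalStateException("Request failed: " + address, e);
        } finally {
            if (conn != null) conn.disconnect();
        }
    }

    /** Parses a UTF-8 XML stream and collects the trimmed text of each title element. */
    static List<String> parseTitles(InputStream in) {
        final List<String> titles = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null only while inside a title element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) current = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) current.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName) && current != null) {
                    titles.add(current.toString().trim());
                    current = null;
                }
            }
        };
        try {
            InputSource source = new InputSource(in);
            source.setEncoding("UTF-8");
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return titles;
    }
}
```

On Android, fetchTitles would have to run off the main thread. Keeping the parsing in its own method also makes it possible to check the decoding against a local byte array without any network access.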
Update 2
As the problem is not the encoding but the source XML being escaped to HTML entities, the best solution (besides correcting the PHP so that it does not escape the response) is to use the very handy static StringEscapeUtils class from the apache.commons.lang library. After importing the library, unescape the text in your XML handler's characters method, as the updated code in the question does with StringEscapeUtils.unescapeHtml(new String(ch, start, length).trim()).
Update 3
In your last code the problem is with the initialization of the nodevalue variable: String.copyValueOf(ch, 0, length) ignores the start offset. It should copy length characters beginning at start.
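A small sketch of the difference, assuming the intended fix is simply to honor the start offset (CharactersSlice and its two methods are illustrative names, not part of the question's handler):

```java
class CharactersSlice {
    /** Variant from the question: always copies from index 0 of the buffer. */
    static String fromZero(char[] ch, int start, int length) {
        return String.copyValueOf(ch, 0, length);
    }

    /** Presumed fix: copies the length characters that begin at start. */
    static String fromStart(char[] ch, int start, int length) {
        return new String(ch, start, length);
    }
}
```

A SAX parser typically passes a shared buffer to characters, so start is often non-zero. In that case the first variant returns whatever happens to sit at the beginning of the buffer instead of the element text.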