SAXParser fails when the response contains Hindi or other special characters
I am using a SAX parser to parse an XML response, but it throws an exception:
ExpatParser$ParseException: (not well formed) invalid token
Is there any solution?
Here is my code:
HttpParams params = new BasicHttpParams();
HttpProtocolParams.setContentCharset(params, "UTF-8");
HttpPost postMethod = new HttpPost(MyRequestURL);
DefaultHttpClient hc = new DefaultHttpClient(params);
postMethod.setEntity(new UrlEncodedFormEntity(nameValuePairs));
ResponseHandler<String> res = new BasicResponseHandler();
/* Execute the POST; BasicResponseHandler returns the body as a String. */
String response = hc.execute(postMethod, res);
ByteArrayInputStream byteArrayInputStream =
        new ByteArrayInputStream(response.getBytes("UTF-8"));
/* Get a SAXParser from the SAXParserFactory. */
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
/* Get the XMLReader of the SAXParser we created. */
XMLReader xr = sp.getXMLReader();
/* Create a new ContentHandler and apply it to the XML-Reader*/
MyHandler objHandler = new MyHandler();
xr.setContentHandler(objHandler);
InputSource inputSource = new InputSource(byteArrayInputStream);
inputSource.setEncoding("UTF-8");
/* Parse the xml-data from our URL. */
xr.parse(inputSource);
/* Parsing has finished. */
Comments (5)
I'm not entirely sure that it will solve your problem, but I'd set the charset on the InputSource using its setEncoding() method.
Try with android.util.Xml.parse()
First argument InputStream => HttpResponse.getEntity().getContent()
Second argument Xml.Encoding => Xml.Encoding.UTF_8
Last argument ContentHandler => your handler
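The idea behind this suggestion is to hand the parser the raw response bytes instead of a String that has already been decoded. As a rough sketch of the same idea with the standard JAXP API (not android.util.Xml, so it also runs off-device), the class and method names below are made up for illustration, and the byte array stands in for HttpResponse.getEntity().getContent():

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original question.
class StreamParseSketch {

    /** Collects all character data reported by the parser. */
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the stream as-is, so the parser itself decodes the bytes. */
    static String parseText(InputStream in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TextCollector handler = new TextCollector();
        parser.parse(in, handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // "namaste" in Devanagari, written with Unicode escapes.
        String hindi = "\u0928\u092E\u0938\u094D\u0924\u0947";
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><msg>" + hindi + "</msg>";

        // Stand-in for the HTTP response body: raw UTF-8 bytes.
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);

        String text = parseText(new ByteArrayInputStream(body));
        System.out.println(text.equals(hindi)); // prints "true"
    }
}
```

Because no intermediate String is built, there is no second place where a wrong charset could be applied before the parser sees the data.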
This should solve the problem:
First Answer
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in your XML output in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively.
The right angle bracket (>) may be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
Please check your XML; it seems to contain these special characters (&, <, >).
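To make the escaping rule concrete, here is a small sketch (my own illustration, not the sample from this answer). The helper names escapeXml and isWellFormed are hypothetical; the parser calls are plain JAXP. It shows that a literal ampersand in text content makes the document not well-formed, while the escaped form parses:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class for illustration only.
class XmlEscapeSketch {

    /** Escapes the three characters discussed above for use in text content. */
    static String escapeXml(String s) {
        // The ampersand must be replaced first, or the other entities get double-escaped.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    /** Returns true if the parser accepts the document, false on a parse error. */
    static boolean isWellFormed(String xml) throws IOException, ParserConfigurationException {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String raw = "Tom & Jerry <3";
        System.out.println(isWellFormed("<a>" + raw + "</a>"));            // prints "false"
        System.out.println(isWellFormed("<a>" + escapeXml(raw) + "</a>")); // prints "true"
    }
}
```

Note that this has to happen on the side that generates the XML; a client that only receives the response cannot reliably repair unescaped markup characters afterwards.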
After discussion with Vaibhav Jani
Here is the sample xml file
And this is the SAX parser for the sample XML
What encoding are you using?
If you are using ISO-8859-1, try using UTF-8.
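Why the encoding matters can be shown in a few lines. This is only a sketch with made-up names. It assumes the server sends UTF-8 bytes and that something on the client side (for example, a response handler that builds a String from the body) may decode them as ISO-8859-1:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
class CharsetMismatchSketch {

    /** Decodes a response body with the given charset. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // The word "Hindi" in Devanagari, written with Unicode escapes.
        String original = "\u0939\u093F\u0928\u094D\u0926\u0940";

        // Assumed wire format: the server encodes the text as UTF-8.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the text.
        System.out.println(decode(wire, StandardCharsets.UTF_8).equals(original));      // prints "true"

        // Decoding with ISO-8859-1 turns every byte into a separate character.
        System.out.println(decode(wire, StandardCharsets.ISO_8859_1).equals(original)); // prints "false"
    }
}
```

Once the text has been decoded with the wrong charset, calling getBytes("UTF-8") on the result encodes the mangled characters rather than the original ones, so setting UTF-8 on the InputSource afterwards cannot undo the damage.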