Android DOM parser problem
I have an RSS feed to parse that contains several tags. I am able to retrieve the value (the child text) of every node except the description tag. Here is the relevant part of the feed:
<fflag>0</fflag>
<tflag>0</tflag>
<ens1:org>C Opera Production</ens1:org>
<description>
<p>Opera to be announced</p>
<p>$15 adults/$12 seniors/$10 for college students<span style="white-space: pre;"> </span></p>
</description>
The code that I am using for this is:
StringBuffer descriptionAccumulator = new StringBuffer();

else if (property.getNodeName().equals("description")) {
    try {
        String desc = property.getFirstChild().getNodeValue();
        if (property.getNodeName().equals("p")) {
            descriptionAccumulator.append(property.getFirstChild().getNodeValue());
        }
    } catch (Exception e) {
        Log.i(tag, "No desc");
    }
}
else if (property.getNodeName().equals("ens1:org")) {
    try {
        event.setOrganization(property.getFirstChild().getNodeValue());
        Log.i(tag, "org" + property.getFirstChild().getNodeValue());
    } catch (Exception e) {
    }
}
else if (property.getNodeName().equals("area")
        || property.getNodeName().equals("fflag")
        || property.getNodeName().equals("tflag")
        || property.getNodeName().equals("guid")) {
    try {
        //event.setOrganization(property.getFirstChild().getNodeValue());
        Log.i(tag, "org" + property.getFirstChild().getNodeValue());
    } catch (Exception e) {
    }
}
else if (property.getNodeName().equals("p")
        || property.getNodeName().equals("em")
        || property.getNodeName().equals("br")
        || property.getNodeName().startsWith("em")
        || property.getNodeName().startsWith("span")
        || property.getNodeName().startsWith("a")
        || property.getNodeName().startsWith("div")
        || property.getNodeName().equals("div")
        || property.getNodeName().startsWith("p")) {
    descriptionAccumulator.append(property.getFirstChild().getNodeValue());
    descriptionAccumulator.append(".");
    System.out.println("description added:" + descriptionAccumulator);
    Log.i("Description", descriptionAccumulator + property.getFirstChild().getNodeValue());
}
I tried capturing the value of the <description> tag, but that didn't work, so I tried handling all the usual HTML formatting tags as well, still with no success. Using a different parser is not an option. Could somebody please help me with this? Thanks.
Comments (2)
I believe something is wrong with the RSS XML. For instance, check the XML returned by the StackOverflow RSS feed. Specifically, pay attention to what the content of the <summary type="html"> node looks like: it has no child XML nodes inside, only plain XML-escaped text. So if it is acceptable in your case, put the effort into generating proper RSS XML rather than into fixing the consequences.
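To illustrate the point about escaped content, here is a minimal sketch (not taken from the original post) comparing a description that holds XML-escaped text with one that holds nested markup. The class name EscapedDescriptionDemo, the helper firstChildValue, and the <item> wrapper are hypothetical, and the sketch assumes the standard JDK DOM parser rather than any Android-specific behaviour.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
class EscapedDescriptionDemo {

    // Parses the XML string and returns getFirstChild().getNodeValue()
    // for the first <description> element, as the question's code does.
    static String firstChildValue(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element description = (Element) doc.getElementsByTagName("description").item(0);
        return description.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        // Escaped markup: the description has one text child holding the HTML as a string.
        String escaped = "<item><description>&lt;p&gt;Opera to be announced&lt;/p&gt;</description></item>";
        // Nested markup: the first child is the <p> element, whose node value is null.
        String nested = "<item><description><p>Opera to be announced</p></description></item>";

        System.out.println(firstChildValue(escaped)); // prints <p>Opera to be announced</p>
        System.out.println(firstChildValue(nested));  // prints null
    }
}
```

With a feed like the one in the question, where a line break precedes the first <p>, the first child would instead be a whitespace-only text node, so the call returns whitespace rather than the description text.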
You are parsing this as XML, so the description tag doesn't have a string value; it has multiple children. You might try getting the description node and pretty-printing its children. See LSSerializer for printing to XML.
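As a rough sketch of that suggestion, the code below reads a description element in two ways: by recursively collecting the text of all descendant nodes, and by serializing the child elements back to markup with LSSerializer. The class DescriptionReader and its method names are hypothetical. The sketch assumes a standard JDK; the DOM implementation on Android may not provide the "LS" feature, in which case only the recursive text walk would apply.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
class DescriptionReader {

    // Parses the XML string and returns its first <description> element.
    static Node parseDescription(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("description").item(0);
    }

    // Recursively appends the value of every text or CDATA node below 'node'.
    static void collectText(Node node, StringBuilder out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                out.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                collectText(child, out);
            }
        }
    }

    // Serializes each child element of 'node' back to markup.
    // Assumes the DOM implementation supports the "LS" feature (true on a desktop JDK).
    static String childMarkup(Node node) {
        DOMImplementationLS ls = (DOMImplementationLS)
                node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", false);
        StringBuilder out = new StringBuilder();
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                out.append(serializer.writeToString(child));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<item><description>"
                + "<p>Opera to be announced</p>"
                + "<p>$15 adults/$12 seniors/$10 for college students"
                + "<span style=\"white-space: pre;\"> </span></p>"
                + "</description></item>";
        Node description = parseDescription(xml);

        StringBuilder text = new StringBuilder();
        collectText(description, text);
        System.out.println(text.toString().trim());
        System.out.println(childMarkup(description));
    }
}
```

The text walk avoids depending on which HTML tags appear inside the description, so there is no need to enumerate "p", "span", "div" and so on by name.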