Stax problem parsing a document with an end element and a start element on the same line
I have the following code for converting the elements of an XML file into a String using Stax:
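    import java.io.ByteArrayOutputStream;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;
    import javax.xml.stream.events.XMLEvent;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stax.StAXSource;
    import javax.xml.transform.stream.StreamResult;

    // The enclosing class was not shown in the post; "StaxTest" is a placeholder name.
    public class StaxTest {

        // Creates a non-validating reader with DTD support switched off.
        private static XMLStreamReader getReader(InputStream inputStream) throws XMLStreamException {
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            xmlInputFactory.setProperty("javax.xml.stream.isValidating", false);
            xmlInputFactory.setProperty("javax.xml.stream.supportDTD", false);
            return xmlInputFactory.createXMLStreamReader(inputStream);
        }

        // Serializes the element the reader is currently positioned on.
        private static String readElement(XMLStreamReader reader) throws XMLStreamException, TransformerException {
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer t = tf.newTransformer();
            ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
            StAXSource source = new StAXSource(reader);
            t.transform(source, new StreamResult(outputStream));
            // Decode explicitly as UTF-8 (the transformer's default output encoding)
            // instead of relying on the platform charset.
            return new String(outputStream.toByteArray(), StandardCharsets.UTF_8);
        }

        public static void main(String[] args) throws Exception {
            InputStream inputStream = new FileInputStream("c:\\temp\\test.xml");
            XMLStreamReader xmlStreamReader = getReader(inputStream);
            while (xmlStreamReader.hasNext()) {
                int eventType = xmlStreamReader.next();
                if (eventType == XMLEvent.START_ELEMENT) {
                    String elementName = xmlStreamReader.getName().getLocalPart();
                    if (!elementName.equalsIgnoreCase("element")) {
                        continue;
                    }
                    String productStr = readElement(xmlStreamReader);
                    System.out.println(productStr);
                }
            }
        }
    }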
This works fine on the following XML fragment:
    <testDoc>
        <element>
            <a>hello world</a>
            <b>hello world again</b>
        </element>
        <element>
            <a>foo</a>
            <b>foo bar</b>
        </element>
    </testDoc>
However, there are problems with this fragment, where the </element> and <element> are on the same line:
    <testDoc>
        <element>
            <a>hello world</a>
            <b>hello world again</b>
        </element><element>
            <a>foo</a>
            <b>foo bar</b>
        </element>
    </testDoc>
In the second example it only seems to process the first element and not the second one. Any ideas?
Update:
I got it to work with the following code:
    public static void main(String[] args) throws Exception {
        InputStream inputStream = new FileInputStream("c:\\temp\\test.xml");
        XMLStreamReader xmlStreamReader = getReader(inputStream);
        while (xmlStreamReader.hasNext()) {
            // Inspect the current event instead of advancing first.
            int eventType = xmlStreamReader.getEventType();
            if (eventType == XMLEvent.START_ELEMENT) {
                String elementName = xmlStreamReader.getName().getLocalPart();
                if (!elementName.equalsIgnoreCase("element")) {
                    xmlStreamReader.next();
                    continue;
                }
                System.out.println(readElement(xmlStreamReader));
            } else {
                xmlStreamReader.next();
            }
        }
    }
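A possible explanation for why reading the current event helps, offered as a sketch and not a confirmed diagnosis: with the JDK's bundled StAX parser and identity transformer, transforming a StAXSource appears to leave the reader on the event after the element's closing tag. When <element> follows </element> directly, that event is already the next START_ELEMENT, so a loop that calls next() before looking would step past it. The names below (CursorAfterTransform, eventAfterTransform) are made up for illustration, and the XML is inlined instead of being read from a file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper class, not part of the original post.
class CursorAfterTransform {

    // Advances to the first <element>, copies it out with an identity
    // transform, and returns the event type the reader is left on.
    static int eventAfterTransform(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "element".equalsIgnoreCase(reader.getLocalName())) {
                    Transformer t = TransformerFactory.newInstance().newTransformer();
                    t.transform(new StAXSource(reader), new StreamResult(new StringWriter()));
                    return reader.getEventType();
                }
            }
            throw new IllegalStateException("no <element> found");
        } catch (XMLStreamException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sameLine = "<testDoc>\n<element><a>x</a></element><element><a>y</a></element>\n</testDoc>";
        String separate = "<testDoc>\n<element><a>x</a></element>\n<element><a>y</a></element>\n</testDoc>";
        // If the reader is already on START_ELEMENT, a loop that calls next() first would skip it.
        System.out.println("same line, on START_ELEMENT: "
                + (eventAfterTransform(sameLine) == XMLStreamConstants.START_ELEMENT));
        System.out.println("separate lines, on START_ELEMENT: "
                + (eventAfterTransform(separate) == XMLStreamConstants.START_ELEMENT));
    }
}
```

Other StAX or XSLT implementations may leave the cursor somewhere else, so it is worth running this against whichever parser is actually on the classpath.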
Comments (1)
Looks like a bug to me. You don't say which Stax parser you are using: some of them are pretty ropey. Woodstox is the most reliable.
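Since this hinges on which parser is in play, one quick way to find out is to print the classes that the standard factory lookups resolve to. A minimal sketch; StaxImplementationCheck and describe are hypothetical names, and only the standard newInstance() calls are used.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper class, not part of the original post.
class StaxImplementationCheck {

    // Reports the concrete factory classes picked up from the classpath.
    static String describe() {
        return "XMLInputFactory: " + XMLInputFactory.newInstance().getClass().getName()
                + System.lineSeparator()
                + "TransformerFactory: " + TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

A class name under com.ctc.wstx would indicate Woodstox; anything else is the JDK's bundled parser or another library on the classpath.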