Is there a generic way to read complex XML using SaxParser?
I am using a SAX parser to read a large, complex XML file. I do not want to create model classes because I do not know exactly what data the XML will contain, so I am looking for a generic way to read the XML data using some sort of context.
I have used a similar approach for JSON with Jackson, which worked very well for me. Since I am new to SAX parsing, I do not fully understand how to achieve the same thing. For complex nested values I am unable to establish a parent-child relationship, and I am unable to associate tags with their attributes.
Following is the code I have so far:
ContextNode is my generic class for storing all XML information using parent-child relationships:
import java.util.ArrayList;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import lombok.ToString;

@Getter
@Setter
@ToString
@NoArgsConstructor
public class ContextNode {
    protected String name;
    protected String value;
    protected ArrayList<ContextNode> children = new ArrayList<>();
    protected ContextNode parent;

    // Constructor 1: stores a simple field (name and text value).
    public ContextNode(final String name, final String value) {
        this.name = name;
        this.value = value;
    }

    // Constructor 2: stores a complex field that has inner elements.
    public ContextNode(final ContextNode parent, final String name, final String value) {
        this(name, value);
        this.parent = parent;
    }
}
The following is my method for parsing the XML with SAX, in EventReader:
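import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

public class EventReader {
    // Method to read XML events and create pre-hash string from it.
    public static void xmlParser(final InputStream xmlStream) {
        final SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            final SAXParser saxParser = factory.newSAXParser();
            final SaxHandler handler = new SaxHandler();
            // The handler receives one callback per XML event as the stream is read.
            saxParser.parse(xmlStream, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }
}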
The following is my SaxHandler:
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {
    private final List<String> XML_IGNORE_FIELDS = Arrays.asList("person:personDocument", "DocumentBody", "DocumentList");
    // List.contains is case-sensitive, so this must match the element name in the XML ("Person").
    private final List<String> EVENT_TYPES = Arrays.asList("Person");
    private Map<String, String> XML_NAMESPACES = null;
    private ContextNode contextNode = null;
    private StringBuilder currentValue = new StringBuilder();

    @Override
    public void startDocument() {
        // Initialise the namespace map before any element is read.
        XML_NAMESPACES = new HashMap<>();
    }

    @Override
    public void startElement(final String uri, final String localName, final String qName, final Attributes attributes) {
        // For every new element in XML reset the StringBuilder.
        currentValue.setLength(0);
        if (qName.equalsIgnoreCase("person:personDocument")) {
            // Add the attributes and namespaces to the Map.
            for (int att = 0; att < attributes.getLength(); att++) {
                if (attributes.getQName(att).contains(":")) {
                    // Namespace declaration (e.g. xmlns:person): store it under its prefix for future use.
                    XML_NAMESPACES.put(attributes.getQName(att).substring(attributes.getQName(att).indexOf(":") + 1), attributes.getValue(att));
                } else {
                    // Any other attribute of the root element: store it under its own name.
                    XML_NAMESPACES.put(attributes.getQName(att), attributes.getValue(att));
                }
            }
        } else if (EVENT_TYPES.contains(qName)) {
            contextNode = new ContextNode("type", qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        currentValue.append(ch, start, length);
    }

    @Override
    public void endElement(final String uri, final String localName, final String qName) {
        if (!XML_IGNORE_FIELDS.contains(qName)) {
            if (!EVENT_TYPES.contains(qName)) {
                System.out.println("QName : " + qName + " Value : " + currentValue);
                // Every element is added directly under contextNode, so nesting and attributes are lost here.
                contextNode.children.add(new ContextNode(qName, currentValue.toString()));
            }
        }
    }

    @Override
    public void endDocument() {
        System.out.println(contextNode.getChildren().toString());
        System.out.println("End of Document");
    }
}
The following is my test case, which calls the xmlParser method:
@Test
public void xmlReader() throws Exception {
    final InputStream xmlStream = getClass().getResourceAsStream("/xmlFileContents.xml");
    EventReader.xmlParser(xmlStream);
}
Following is the XML I need to read using a generic approach:
<?xml version="1.0" ?>
<person:personDocument xmlns:person="https://example.com" schemaVersion="1.2" creationDate="2020-03-03T13:07:51.709Z">
    <DocumentBody>
        <DocumentList>
            <Person>
                <bithTime>2020-03-04T11:00:30.000+01:00</bithTime>
                <name>Batman</name>
                <Place>London</Place>
                <hobbies>
                    <hobby>painting</hobby>
                    <hobby>football</hobby>
                </hobbies>
                <jogging distance="10.3">daily</jogging>
                <purpose2>
                    <id>1</id>
                    <purpose>Dont know</purpose>
                </purpose2>
            </Person>
        </DocumentList>
    </DocumentBody>
</person:personDocument>
Comments (1)
Providing the answer as it can be helpful to someone in the future:
First, we need to create a class, ContextNode, which can hold the information.
Then we can read the XML and store the information in the context node.
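A minimal sketch of these two steps, not necessarily the code the answer originally used. It assumes a stack of currently open elements, so that each new element is attached to whichever node is on top of the stack; that is one common way to recover parent-child relationships from SAX events. The names GenericSaxTree, Node, and TreeHandler are hypothetical, and the node here also keeps a map of attributes so that a tag such as jogging stays linked to its distance attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class GenericSaxTree {

    // Hypothetical generic node: element name, attributes, text, and child elements.
    static class Node {
        final String name;
        final Node parent;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        Node(Node parent, String name) {
            this.parent = parent;
            this.name = name;
        }

        // Text content with surrounding whitespace (indentation, newlines) removed.
        String value() {
            return text.toString().trim();
        }
    }

    static class TreeHandler extends DefaultHandler {
        private final Deque<Node> stack = new ArrayDeque<>();
        private Node root;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Node parent = stack.peek(); // null only for the document's root element
            Node node = new Node(parent, qName);
            for (int i = 0; i < atts.getLength(); i++) {
                node.attributes.put(atts.getQName(i), atts.getValue(i));
            }
            if (parent == null) {
                root = node;
            } else {
                parent.children.add(node);
            }
            stack.push(node);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text always belongs to the innermost element that is still open.
            if (!stack.isEmpty()) {
                stack.peek().text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            stack.pop(); // the element is complete; its parent becomes current again
        }

        Node getRoot() {
            return root;
        }
    }

    static Node parse(String xml) throws Exception {
        TreeHandler handler = new TreeHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.getRoot();
    }

    // Prints one line per element, indented by depth.
    static void print(Node node, String indent) {
        System.out.println(indent + node.name + " " + node.attributes + " '" + node.value() + "'");
        for (Node child : node.children) {
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Person><name>Batman</name>"
                + "<hobbies><hobby>painting</hobby><hobby>football</hobby></hobbies>"
                + "<jogging distance=\"10.3\">daily</jogging></Person>";
        print(parse(xml), "");
    }
}
```

Because the whole tree is held in memory, this gives up part of the streaming benefit of SAX. For very large files it may be preferable to build one such tree per record element (here, each Person), process it in endElement, and then discard it.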
You may need to make some additional changes based on your particular requirement. This is just a sample that gives some idea.
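One such change might concern the namespace declarations, which the question collects by inspecting the attributes of the root element. As a sketch under the assumption that a namespace-aware parser is acceptable, the declarations can instead be received through the startPrefixMapping callback of DefaultHandler. NamespaceCollector and PrefixHandler are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceCollector {

    static class PrefixHandler extends DefaultHandler {
        final Map<String, String> namespaces = new LinkedHashMap<>();

        // Called once for each xmlns declaration when the parser is namespace-aware.
        @Override
        public void startPrefixMapping(String prefix, String uri) {
            namespaces.put(prefix, uri);
        }
    }

    static Map<String, String> collect(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // off by default for SAXParserFactory
        PrefixHandler handler = new PrefixHandler();
        factory.newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.namespaces;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person:personDocument xmlns:person=\"https://example.com\" schemaVersion=\"1.2\"/>";
        System.out.println(collect(xml));
    }
}
```

With a namespace-aware parser the xmlns declarations are normally no longer reported in the Attributes list, so ordinary attributes such as schemaVersion would still need to be read in startElement.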