遍历 DOM 树以获取(名称,值)属性对和叶节点

发布于 2024-12-01 06:50:00 字数 673 浏览 1 评论 0原文

我想遍历 DOM 中的 XML 文件,以便检索所有(名称,值)对:

  1. 属性名称和值;
  2. 所有叶子节点名称及其文本内容;

以下面的 XML 文件为例:

<?xml version="1.0" encoding="UTF-8"?>
<title text="title1">
    <comment id="comment1">
        <data> abcd </data>
        <data> efgh </data>
    </comment>
    <comment id="comment2">
        <data> ijkl </data>
        <data> mnop </data>
        <data> qrst </data>
    </comment>
</title>

我想要的名称值对是:

text=title1
id=comment1
data=abcd
data=efgh
id=commment2
data=ijkl
data=mnop
data=qrst

I want to traverse through an XML file in DOM for the purpose of retrieving as (name,value) pairs all:

  1. Attribute names and values;
  2. All leaf node names and their text content;

So given the following XML file as an example:

<?xml version="1.0" encoding="UTF-8"?>
<title text="title1">
    <comment id="comment1">
        <data> abcd </data>
        <data> efgh </data>
    </comment>
    <comment id="comment2">
        <data> ijkl </data>
        <data> mnop </data>
        <data> qrst </data>
    </comment>
</title>

What I want as name value pairs are:

text=title1
id=comment1
data=abcd
data=efgh
id=commment2
data=ijkl
data=mnop
data=qrst

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

一桥轻雨一伞开 2024-12-08 06:50:00

更简单的解决方案可能是使用 XPath 提取所有名称值对,如下例所示。您还可以跳过 DOM 构造并直接在 InputSource 上调用评估。 XPath 表达式

//@* | //*[not(*)]

匹配所有属性和所有没有任何子节点的节点的并集。

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class Test {

    private static final String xml = "<title text='title1'>\n"
            + "  <comment id='comment1'>\n"
            + "    <data> abcd </data>\n"
            + "    <data> efgh </data>\n"
            + "  </comment>\n"
            + "  <comment id='comment2'>\n"
            + "    <data> ijkl </data>\n"
            + "    <data> mnop </data>\n"
            + "    <data> qrst </data>\n"
            + "  </comment>\n"
            + "</title>\n";

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xp = xpf.newXPath();
        NodeList nodes = (NodeList)xp.evaluate("//@* | //*[not(*)]", doc, XPathConstants.NODESET);

        System.out.println(nodes.getLength());

        for (int i=0, len=nodes.getLength(); i<len; i++) {
            Node item = nodes.item(i);
            System.out.println(item.getNodeName() + " : " + item.getTextContent());
        }
    }
}

An easier solution might be to use XPath to extract all name value pairs as in the following example. You could also skip the DOM construction and call evaluate directly on the InputSource. The XPath expression

//@* | //*[not(*)]

matches the union of all attributes and all nodes that don't have any child nodes.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class Test {

    private static final String xml = "<title text='title1'>\n"
            + "  <comment id='comment1'>\n"
            + "    <data> abcd </data>\n"
            + "    <data> efgh </data>\n"
            + "  </comment>\n"
            + "  <comment id='comment2'>\n"
            + "    <data> ijkl </data>\n"
            + "    <data> mnop </data>\n"
            + "    <data> qrst </data>\n"
            + "  </comment>\n"
            + "</title>\n";

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xp = xpf.newXPath();
        NodeList nodes = (NodeList)xp.evaluate("//@* | //*[not(*)]", doc, XPathConstants.NODESET);

        System.out.println(nodes.getLength());

        for (int i=0, len=nodes.getLength(); i<len; i++) {
            Node item = nodes.item(i);
            System.out.println(item.getNodeName() + " : " + item.getTextContent());
        }
    }
}
溇涏 2024-12-08 06:50:00

怎么样:

    String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<title text=\"title1\">\n" +
        "    <comment id=\"comment1\">\n" +
        "        <data> abcd </data>\n" +
        "        <data> efgh </data>\n" +
        "    </comment>\n" +
        "    <comment id=\"comment2\">\n" +
        "        <data> ijkl </data>\n" +
        "        <data> mnop </data>\n" +
        "        <data> qrst </data>\n" +
        "    </comment>\n" +
        "</title>\n";

    try {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        DocumentTraversal traversal = (DocumentTraversal) doc;

        NodeIterator iterator = traversal.createNodeIterator(
          doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);

        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            //System.out.println("Element: " + ((Element) n).getTagName());
            String tagname = ((Element) n).getTagName();
            if(tagname.equals("title")) {
                System.out.println("text=" + ((Element)n).getAttribute("text"));
            }
            else if(tagname.equals("comment")) {
                System.out.println("id=" + ((Element)n).getAttribute("id"));
            }
            else if(tagname.equals("data")) {
                System.out.println("data=" + ((Element)n).getTextContent());
            }
            else {
                System.out.println("Unhandled element");
            }
        }

    } catch (Exception e) {
        e.printStackTrace();
    }

好吧,所以你对此不满意,这样怎么样:

 String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<title text=\"title1\">\n" +
        "    <comment id=\"comment1\">\n" +
        "        <data> abcd </data>\n" +
        "        <data> efgh </data>\n" +
        "    </comment>\n" +
        "    <comment id=\"comment2\">\n" +
        "        <data> ijkl </data>\n" +
        "        <data> mnop </data>\n" +
        "        <data> qrst </data>\n" +
        "    </comment>\n" +
        "</title>\n";

    try {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        DocumentTraversal traversal = (DocumentTraversal) doc;

        NodeIterator iterator = traversal.createNodeIterator(
          doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);

        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            //System.out.println("Element: " + ((Element) n).getTagName());
            String tagname = ((Element) n).getTagName();

            NamedNodeMap map = ((Element)n).getAttributes();
            if(map.getLength() > 0) {

                for(int i=0; i<map.getLength(); i++) {
                    Node node = map.item(i);
                    System.out.println(node.getNodeName() + "=" + node.getNodeValue());
                }
            }
            else {
                System.out.println(tagname + "=" + ((Element)n).getTextContent());
            }
        }

    } catch (Exception e) {
        e.printStackTrace();
    }

How about something like:

    String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<title text=\"title1\">\n" +
        "    <comment id=\"comment1\">\n" +
        "        <data> abcd </data>\n" +
        "        <data> efgh </data>\n" +
        "    </comment>\n" +
        "    <comment id=\"comment2\">\n" +
        "        <data> ijkl </data>\n" +
        "        <data> mnop </data>\n" +
        "        <data> qrst </data>\n" +
        "    </comment>\n" +
        "</title>\n";

    try {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        DocumentTraversal traversal = (DocumentTraversal) doc;

        NodeIterator iterator = traversal.createNodeIterator(
          doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);

        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            //System.out.println("Element: " + ((Element) n).getTagName());
            String tagname = ((Element) n).getTagName();
            if(tagname.equals("title")) {
                System.out.println("text=" + ((Element)n).getAttribute("text"));
            }
            else if(tagname.equals("comment")) {
                System.out.println("id=" + ((Element)n).getAttribute("id"));
            }
            else if(tagname.equals("data")) {
                System.out.println("data=" + ((Element)n).getTextContent());
            }
            else {
                System.out.println("Unhandled element");
            }
        }

    } catch (Exception e) {
        e.printStackTrace();
    }

Okay, so you weren't happy with that, how about this:

 String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<title text=\"title1\">\n" +
        "    <comment id=\"comment1\">\n" +
        "        <data> abcd </data>\n" +
        "        <data> efgh </data>\n" +
        "    </comment>\n" +
        "    <comment id=\"comment2\">\n" +
        "        <data> ijkl </data>\n" +
        "        <data> mnop </data>\n" +
        "        <data> qrst </data>\n" +
        "    </comment>\n" +
        "</title>\n";

    try {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        DocumentTraversal traversal = (DocumentTraversal) doc;

        NodeIterator iterator = traversal.createNodeIterator(
          doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);

        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            //System.out.println("Element: " + ((Element) n).getTagName());
            String tagname = ((Element) n).getTagName();

            NamedNodeMap map = ((Element)n).getAttributes();
            if(map.getLength() > 0) {

                for(int i=0; i<map.getLength(); i++) {
                    Node node = map.item(i);
                    System.out.println(node.getNodeName() + "=" + node.getNodeValue());
                }
            }
            else {
                System.out.println(tagname + "=" + ((Element)n).getTextContent());
            }
        }

    } catch (Exception e) {
        e.printStackTrace();
    }
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文