XML parser returns a null element

Posted 2024-10-09 19:30:24

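When I try to parse an XML file, the title sometimes comes back as a null element. I think it has to do with the HTML entity &#039; (an apostrophe).

How can I solve this problem?

I have the following XML file: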

<item>
<title>&#039; Nieuwe DVD &#039;</title>
<description>tekst, tekst tekst</description>
<link>dvd.html</link>
<category>nieuws</category>
<pubDate>Sat, 1 Jan 2011 9:24:00 +0000</pubDate>
</item>

And the following code to parse the XML file:

// DocumentBuilderFactory and DocumentBuilder are used for XML parsing
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();

// Parse the XML data with db (the DocumentBuilder) and take the root element
Document document = db.parse(is);
Element element = document.getDocumentElement();

// Normalize the tree, then collect the <item> nodes into a NodeList
element.normalize();

NodeList nodeList = element.getElementsByTagName("item");

if (nodeList.getLength() > 0)
{
    for (int i = 0; i < nodeList.getLength(); i++)
    {
        // Take each entry (corresponds to an <item></item> element in the XML data)
        Element entry = (Element) nodeList.item(i);
        entry.normalize();

        Element _titleE = (Element) entry.getElementsByTagName("title").item(0);
        Element _categoryE = (Element) entry.getElementsByTagName("category").item(0);
        Element _pubDateE = (Element) entry.getElementsByTagName("pubDate").item(0);
        Element _linkE = (Element) entry.getElementsByTagName("link").item(0);

        String _title = _titleE.getFirstChild().getNodeValue();
        String _category = _categoryE.getFirstChild().getNodeValue();
        Date _pubDate = new Date(_pubDateE.getFirstChild().getNodeValue());
        String _link = _linkE.getFirstChild().getNodeValue();

        // Create the RssItem object and add it to the ArrayList
        RssItem rssItem = new RssItem(_title, _category, _pubDate, _link);

        rssItems.add(rssItem);
        conn.disconnect();
    }
}



Comments (1)

我还不会笑 2024-10-16 19:30:24

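Don't use getFirstChild().getNodeValue() when you really want getTextContent(). getFirstChild() returns only the first child node of the element, and that node is not guaranteed to hold the whole text: some parsers may split the text around entity references such as &#039;. getTextContent() returns the concatenated text of all descendant nodes.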
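As a rough illustration of that advice, here is a minimal sketch that reads the title with getTextContent(). The class name TitleTextDemo and the helpers textOf and readTitle are made up for this example and are not part of the question's code. The XML is also reduced to a single item so the snippet stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class TitleTextDemo {

    // Hypothetical helper: full text of the first descendant element with
    // the given tag name, or null when no such element exists.
    public static String textOf(Element parent, String tag) {
        Node node = parent.getElementsByTagName(tag).item(0);
        // getTextContent() concatenates every text node below the element,
        // so it does not matter how the parser split the text.
        return node == null ? null : node.getTextContent();
    }

    // Hypothetical helper: parses the XML string and returns the <title> text.
    public static String readTitle(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return textOf(doc.getDocumentElement(), "title");
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<item><title>&#039; Nieuwe DVD &#039;</title></item>";
        System.out.println(readTitle(xml)); // prints "' Nieuwe DVD '"
    }
}
```

In the question's loop, the same idea would presumably amount to calling _titleE.getTextContent() (after checking _titleE for null) instead of _titleE.getFirstChild().getNodeValue(), and likewise for the other fields.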
