使用 Commons JXPath 解析 XML 的问题

发布于 2024-12-05 12:19:41 字数 2572 浏览 0 评论 0原文

我正在尝试使用 Apache Commons JXPath 解析 XML。但由于某种原因，在解析 xml 后它无法识别子节点。这是示例代码：

private static void processUrl(String seed){
    String test = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><feed xmlns=\"http://www.w3.org/2005/Atom\" xmlns:media=\"http://search.yahoo.com/mrss/\" xmlns:openSearch=\"http://a9.com/-/spec/opensearchrss/1.0/\" xmlns:gd=\"http://schemas.google.com/g/2005\" xmlns:yt=\"http://gdata.youtube.com/schemas/2007\"><id>http://gdata.youtube.com/feeds/api/videos</id><logo>http://www.youtube.com/img/pic_youtubelogo_123x63.gif</logo><link rel=\"alternate\" type=\"text/html\" href=\"http://www.youtube.com\"/><author><name>YouTube</name><uri>http://www.youtube.com/</uri></author><generator version=\"2.1\" uri=\"http://gdata.youtube.com\">YouTube data API</generator><openSearch:totalResults>144</openSearch:totalResults><entry><id>http://gdata.youtube.com/feeds/api/videos/P1lDDu9L5YQ</id><published>2010-09-20T17:41:38.000Z</published><updated>2011-09-18T22:15:38.000Z</updated><category scheme=\"http://schemas.google.com/g/2005#kind\" term=\"http://gdata.youtube.com/schemas/2007#video\"/><link rel=\"alternate\" type=\"text/html\" href=\"http://www.youtube.com/watch?v=P1lDDu9L5YQ&amp;feature=youtube_gdata\"/></entry></feed>";
    Document doc = null;
    try{
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        ByteArrayInputStream bais = new ByteArrayInputStream(test.toString().getBytes("UTF8"));
        doc = builder.parse(bais);
        bais.close();

        JXPathContext ctx = JXPathContext.newContext(doc);
        List entryNodes = ctx.selectNodes("/feed/entry");
        System.out.println("number of threadNodes " + entryNodes.size());
        int totalThreads = 0;
        for (Object each : entryNodes) {
            totalThreads++;
            Node eachEntryNode = (Node) each;
            JXPathContext msgCtx = JXPathContext.newContext(eachEntryNode);
            String title = (String) msgCtx.getValue("title");
        }
    }catch (Exception ex) {
        ex.printStackTrace();
    }
}

我之前使用过 JXPath，从未遇到过任何问题。我调试了文档对象，它似乎没有子节点（）。我所能看到的只是根元素。我也尝试过 DOMParser 但没有任何运气。

DOMParser parser = new DOMParser();
        Document doc = (Document) parser.parseXML(new ByteArrayInputStream(sb0.toString().getBytes("UTF-8")));

如果有人可以提供有关此用法的指示，我将不胜感激。

原文

I'm trying to parse a XML using Apache Commons JXPath. But for some reason, its not able to identify the child nodes after the xml is being parsed. Here's the sample code :

private static void processUrl(String seed){
    String test = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><feed xmlns=\"http://www.w3.org/2005/Atom\" xmlns:media=\"http://search.yahoo.com/mrss/\" xmlns:openSearch=\"http://a9.com/-/spec/opensearchrss/1.0/\" xmlns:gd=\"http://schemas.google.com/g/2005\" xmlns:yt=\"http://gdata.youtube.com/schemas/2007\"><id>http://gdata.youtube.com/feeds/api/videos</id><logo>http://www.youtube.com/img/pic_youtubelogo_123x63.gif</logo><link rel=\"alternate\" type=\"text/html\" href=\"http://www.youtube.com\"/><author><name>YouTube</name><uri>http://www.youtube.com/</uri></author><generator version=\"2.1\" uri=\"http://gdata.youtube.com\">YouTube data API</generator><openSearch:totalResults>144</openSearch:totalResults><entry><id>http://gdata.youtube.com/feeds/api/videos/P1lDDu9L5YQ</id><published>2010-09-20T17:41:38.000Z</published><updated>2011-09-18T22:15:38.000Z</updated><category scheme=\"http://schemas.google.com/g/2005#kind\" term=\"http://gdata.youtube.com/schemas/2007#video\"/><link rel=\"alternate\" type=\"text/html\" href=\"http://www.youtube.com/watch?v=P1lDDu9L5YQ&feature=youtube_gdata\"/></entry></feed>";
    Document doc = null;
    try{
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        ByteArrayInputStream bais = new ByteArrayInputStream(test.toString().getBytes("UTF8"));
        doc = builder.parse(bais);
        bais.close();

        JXPathContext ctx = JXPathContext.newContext(doc);
        List entryNodes = ctx.selectNodes("/feed/entry");
        System.out.println("number of threadNodes " + entryNodes.size());
        int totalThreads = 0;
        for (Object each : entryNodes) {
            totalThreads++;
            Node eachEntryNode = (Node) each;
            JXPathContext msgCtx = JXPathContext.newContext(eachEntryNode);
            String title = (String) msgCtx.getValue("title");
        }
    }catch (Exception ex) {
        ex.printStackTrace();
    }
}

I've used JXPath earlier and never had any issues. I debugged the document object,it doesn't seemed to have the child node () for . All I'm able to see is the root element. I also tried DOMParser without any luck.

DOMParser parser = new DOMParser();
        Document doc = (Document) parser.parseXML(new ByteArrayInputStream(sb0.toString().getBytes("UTF-8")));

I'll appreciate if someone can provide pointers to this isuse.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

自此以后，行同陌路 2024-12-12 12:19:41

此问题与 JXPath 处理默认名称空间的方式有关，它严格遵循 XPath 1.0 规范。这也解释了为什么它在删除默认命名空间 http://www.w3.org/2005/Atom 后仍然有效。为了让它与默认命名空间一起工作，您可以执行以下操作：

JXPathContext ctx = JXPathContext.newContext(doc.getDocumentElement());
// Register the default namespace, giving it a prefix of your choice
ctx.registerNamespace("myfeed", "http://www.w3.org/2005/Atom");

// Now query for entry elements using the registered prefix
List entryNodes = ctx.selectNodes("myfeed:entry");

有关该问题的更多信息，请参阅以下链接。

http://markmail.org/message/7iqw4bjrkwerbh46

让 jxpath 命名空间感知

This issue has to do with how JXPath handles default namespaces, which closely follows the XPath 1.0 specification. This also explains why it worked after you removed the default namespace http://www.w3.org/2005/Atom. In order to get it to work with the default namespace you can do the following:

JXPathContext ctx = JXPathContext.newContext(doc.getDocumentElement());
// Register the default namespace, giving it a prefix of your choice
ctx.registerNamespace("myfeed", "http://www.w3.org/2005/Atom");

// Now query for entry elements using the registered prefix
List entryNodes = ctx.selectNodes("myfeed:entry");

For more information on the issue see the following links.

http://markmail.org/message/7iqw4bjrkwerbh46

Make jxpath namespace aware

回复收藏 0 原文

~没有更多了~