XPath expression returns nothing for //element, but //* returns a count

Posted on 2024-08-23 06:01:38

I'm using XOM to run the following query against the sample data below:

Element root = cleanDoc.getRootElement();
//find all the bold elements, as those mark institution and clinic.
Nodes nodes = root.query("//*");

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">
    <head>
        <title>Patient Information</title>
    </head>
</html>

The following expression returns many elements (on the real data):

//*

but something like

//head

returns nothing. If I iterate over the children of the root, the counts match up, and if I print each element's name, everything looks correct.

I'm taking HTML, parsing it with TagSoup, and then building a XOM Document from the resulting string. Which part of this could go so horribly wrong? I suspect some odd encoding issue, but I'm just not seeing it. Java Strings are Strings, right?

野生奥特曼 2024-08-30 06:01:38

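Your document has a default namespace, which means that in the XPath model all the elements are in that namespace. In XPath 1.0 an unprefixed name such as head matches only elements in no namespace, so //head finds nothing, while //* matches elements regardless of namespace.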

The query should be //html:head. You will have to supply the namespace mapping to the XPath query.

Note that while the XPath expression uses a namespace prefix, it is the namespace URI that must match. The prefix in the query does not have to be the one used in the document.

// Bind the prefix "html" to the XHTML namespace URI for this query.
XPathContext ctx = new XPathContext("html", "http://www.w3.org/1999/xhtml");
Nodes nodes = root.query("//html:head", ctx);
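
To see the behaviour in isolation, here is a minimal sketch that reproduces it with the JDK's built-in javax.xml.xpath API instead of XOM, so it runs without extra dependencies. The class name DefaultNamespaceDemo, the count helper and the SinglePrefixContext class are hypothetical names made up for this illustration; only the sample markup and the namespace URI come from the question. XOM's XPathContext plays roughly the role that the NamespaceContext plays here.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DefaultNamespaceDemo {
    public static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Same shape as the sample data in the question.
    public static final String SAMPLE =
        "<html xmlns=\"" + XHTML + "\" xmlns:html=\"" + XHTML + "\">"
        + "<head><title>Patient Information</title></head>"
        + "</html>";

    /**
     * Counts the nodes matched by expr in SAMPLE.
     * If prefix is non-null, it is bound to the XHTML namespace URI for the query.
     */
    public static int count(String expr, String prefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // JDK parsers are not namespace-aware by default
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(SAMPLE)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            if (prefix != null) {
                xpath.setNamespaceContext(new SinglePrefixContext(prefix, XHTML));
            }
            NodeList hits = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Minimal NamespaceContext that knows exactly one prefix-to-URI binding. */
    static final class SinglePrefixContext implements NamespaceContext {
        private final String prefix;
        private final String uri;

        SinglePrefixContext(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }

        @Override
        public String getNamespaceURI(String p) {
            return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String u) {
            return uri.equals(u) ? prefix : null;
        }

        @Override
        public Iterator<String> getPrefixes(String u) {
            return uri.equals(u)
                ? Collections.singletonList(prefix).iterator()
                : Collections.<String>emptyIterator();
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//*", null));           // html, head, title
        System.out.println(count("//head", null));        // unprefixed name: no-namespace elements only
        System.out.println(count("//html:head", "html")); // prefix bound to the XHTML URI
        System.out.println(count("//x:head", "x"));       // a different prefix, same URI
    }
}
```

The last two calls differ only in the prefix, which illustrates the point above: the match is decided by the namespace URI the prefix is bound to, not by the prefix text itself.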

Author: 寂寞陪衬
