re:test() in XPath for HtmlAgilityPack (get all p tags whose inner text matches a regex)
I want all <p>=.+=</p> tags. The regex works on its own, without the <p> tags.
Here's my XPath: "//p[re:test(.,'^=.+=$', 'i')]"
But I'm getting an exception when I plug it into:
HtmlNodeCollection pNodes = htmlDoc.DocumentNode.SelectNodes("//p[re:test(.,'^=.+=$', 'i')]");
The exception is:
Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.
Edit: The HTML is generated by FCKEditor and has no namespace defined. Do I need to set something for this to work?
The HTML:
<p><style type="text/css">
h2 a { color: black; }</style></p>
<p>----</p>
<h2>test <a href="http://searisen.com">link</a></h2>
<p>== Heading 2 ==</p>
<p>----</p>
<p>=== Heading [http://searisen.com SeaRisen.com] ===</p>
Comments (3)
Apparently HtmlAgilityPack doesn't handle namespaces (not that I had one). So I've come up with this hack,
If there is an HtmlAgilityPack solution I'd love to hear it!
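The code for the hack did not survive in this copy of the post. As a rough guess at what such a workaround could look like (not the poster's original code), one option is to select every p node with plain XPath and apply the regex in C# afterwards. The class and method names below are hypothetical, and the HtmlAgilityPack NuGet package is assumed to be installed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack; // NuGet package; assumed to be installed

static class HeadingHack
{
    // Same pattern as in the question; IgnoreCase mirrors the 'i' flag.
    static readonly Regex Heading = new Regex("^=.+=$", RegexOptions.IgnoreCase);

    // Hypothetical helper: select every <p> with plain XPath, then filter in C#.
    public static List<HtmlNode> FindHeadings(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection paragraphs = doc.DocumentNode.SelectNodes("//p");
        if (paragraphs == null)
            return new List<HtmlNode>();

        // Trim so that stray whitespace around the text does not defeat ^ and $.
        return paragraphs
            .Where(p => Heading.IsMatch(p.InnerText.Trim()))
            .ToList();
    }
}
```

Because the XPath itself contains no prefix or extension function, no namespace manager or XsltContext is needed; the cost is that every p node is materialized before filtering.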
The error you have is due to the fact that the expression re:test uses an XPath function named test (declared in a namespace whose prefix is re) that is unknown to the XSLT context. I don't know where you got that expression from, but it's not standard, so it means nothing in the Html Agility Pack context :-)
For an in-depth explanation, see this cool article: Adding Custom Functions to XPath. Note that you could make it work using these techniques.
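To give a feel for the custom-function technique, here is a minimal sketch that registers a re:test() function through a custom XsltContext. It is an illustration under assumptions, not code from the linked article: the class names RegexContext, RegexTestFunction, and RegexXPathDemo are made up, the namespace URI is an arbitrary choice, and the demo runs against System.Xml's XPathDocument on well-formed markup so that it needs nothing beyond the base class library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical context that knows a single extension function, re:test().
class RegexContext : XsltContext
{
    // Arbitrary URI for this sketch; it only has to be bound to the "re" prefix.
    public const string RegexNamespace = "http://exslt.org/regular-expressions";

    public RegexContext()
    {
        AddNamespace("re", RegexNamespace);
    }

    public override IXsltContextFunction ResolveFunction(
        string prefix, string name, XPathResultType[] argTypes)
    {
        bool isRegexTest = LookupNamespace(prefix) == RegexNamespace && name == "test";
        return isRegexTest ? new RegexTestFunction() : null;
    }

    // Variables and whitespace handling are not needed for this sketch.
    public override IXsltContextVariable ResolveVariable(string prefix, string name) => null;
    public override bool PreserveWhitespace(XPathNavigator node) => true;
    public override int CompareDocument(string baseUri, string nextBaseUri) => 0;
    public override bool Whitespace => true;
}

// Implements re:test(input, pattern [, flags]); only the 'i' flag is honored.
class RegexTestFunction : IXsltContextFunction
{
    public int Minargs => 2;
    public int Maxargs => 3;
    public XPathResultType ReturnType => XPathResultType.Boolean;
    public XPathResultType[] ArgTypes =>
        new[] { XPathResultType.Any, XPathResultType.String, XPathResultType.String };

    public object Invoke(XsltContext context, object[] args, XPathNavigator docContext)
    {
        string input = AsString(args[0]);
        string pattern = AsString(args[1]);
        string flags = args.Length > 2 ? AsString(args[2]) : "";
        var options = flags.Contains("i") ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.IsMatch(input, pattern, options);
    }

    // A node-set argument such as "." arrives as an iterator;
    // use the string value of its first node.
    static string AsString(object arg)
    {
        if (arg is XPathNodeIterator nodes)
            return nodes.MoveNext() ? nodes.Current.Value : "";
        return Convert.ToString(arg);
    }
}

static class RegexXPathDemo
{
    // Returns the text of every <p> whose string value matches ^=.+=$.
    public static List<string> MatchingParagraphs(string xml)
    {
        XPathNavigator navigator = new XPathDocument(new StringReader(xml)).CreateNavigator();

        XPathExpression expr = XPathExpression.Compile("//p[re:test(., '^=.+=$', 'i')]");
        expr.SetContext(new RegexContext()); // supplies the prefix and the function

        var results = new List<string>();
        XPathNodeIterator it = navigator.Select(expr);
        while (it.MoveNext())
            results.Add(it.Current.Value);
        return results;
    }
}
```

With Html Agility Pack, the assumption would be that the navigator returned by htmlDoc.CreateNavigator() can be used in place of the XPathDocument navigator, with the same compiled expression passed to Select; that part is untested here.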
That said, here is a "pure" Html Agility Pack / XPath implementation:
It uses a filter (between [ and ]) and the standard XPath text() node test, which selects the element's text (roughly, its "inner text").
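The answer's own code is missing from this copy, so the following is only a sketch of what a filter built on text() might look like, not the answerer's original. It approximates the regex ^=.+=$ with XPath 1.0 string functions. The names PureXPathHeadings and FindHeadingTexts are hypothetical, and the HtmlAgilityPack NuGet package is assumed to be installed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package; assumed to be installed

static class PureXPathHeadings
{
    // XPath 1.0 has starts-with() but no ends-with(), so the last character
    // is compared via substring(). The length check stands in for ".+".
    const string HeadingXPath =
        "//p[starts-with(text(), '=')" +
        " and string-length(text()) > 2" +
        " and substring(text(), string-length(text())) = '=']";

    // Hypothetical helper: returns the inner text of each matching <p>.
    public static List<string> FindHeadingTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when no node matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(HeadingXPath);
        return nodes == null
            ? new List<string>()
            : nodes.Select(n => n.InnerText).ToList();
    }
}
```

Only core XPath functions appear in the expression, so no namespace manager is required. Note that text() inside a string function refers to the first text child of the p element, so paragraphs whose text is split around child elements may behave differently from the regex.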
To echo what Simon Mourier said, the re:test() function is not a core XPath function. It is available in Calibre's XPath function set (http://manual.calibre-ebook.com/xpath.html#term-re-test), but that is a non-standard extension. I am not aware of any other systems, besides Calibre, that expose the re:test() function.
For a good summary of core XPath functions and XSLT extension functions, see https://developer.mozilla.org/en-US/docs/Web/XPath/Functions