XPath: generating a relative expression from a root node to a specified node?

Posted 2024-10-10 14:43:06

How can I generate the XPath expression needed to traverse from a given root node down the XML structure to a specified node?

I will receive an HTML fragment of a table at runtime. I have to find the desired node based on some criteria, then form an XPath string from the table's root node to that node and return it.

The HTML table structure is not known beforehand. Is there any API in Java that returns the XPath string given the root node and the child node?

Comments (4)

睡美人的小仙女 2024-10-17 14:43:06

I would recommend doing this in Groovy, which provides GPath (essentially an XPath-like path expression facility for the Groovy language). The Groovy syntax is very succinct and powerful, as described in my blog, and it mixes seamlessly with the Java language (Groovy is compiled down to Java class files).

As for what you are trying to achieve: the following should traverse the entire HTML DOM structure and search for a "tag" (e.g. div) with a specific id attribute (e.g. unique_id_for_tag), with each match processed by the closure.

// HTML is assumed to hold the parsed document (e.g. the result of XmlSlurper)
HTML.body.'**'.findAll { it.name() == 'tag' && it["@id"] == 'unique_id_for_tag' }.each {
    // "it" is each matching node in turn
    def cellText = it.td[0].text().trim()
    if (cellText.contains('Hello')) {
        def x = cellText
    }
}
瑾兮 2024-10-17 14:43:06

Below is one way (that I know of) to achieve this:

  1. Build a DOM from the XML.
  2. Locate the target Node with a "//" XPath expression.
  3. Once you have the Node object from step 2, walk up the hierarchy with getParentNode() and build the XPath as you go.
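
The three steps above can be sketched in Java roughly as follows. This is only a minimal illustration using the JDK's built-in DOM and XPath APIs; the class name RelativeXPath and the helpers parse, select, step and pathFrom are hypothetical names made up for this example, and it assumes the HTML fragment is well-formed XML (real-world HTML may need to be tidied first). A positional index such as [2] is added only when an element has same-named siblings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RelativeXPath {

    // Step 1: build a DOM (assumption: the fragment is well-formed XML/XHTML).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse fragment", e);
        }
    }

    // Step 2: locate the target node with an XPath expression.
    static Node select(Document doc, String expression) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("bad expression: " + expression, e);
        }
    }

    // True when the node is an element with the same name as e.
    static boolean sameName(Node node, Element e) {
        return node.getNodeType() == Node.ELEMENT_NODE
                && node.getNodeName().equals(e.getNodeName());
    }

    // One location step: the element name, plus [n] if same-named siblings exist.
    static String step(Element e) {
        int position = 1;
        boolean hasTwin = false;
        for (Node s = e.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (sameName(s, e)) { position++; hasTwin = true; }
        }
        for (Node s = e.getNextSibling(); s != null && !hasTwin; s = s.getNextSibling()) {
            if (sameName(s, e)) { hasTwin = true; }
        }
        return hasTwin ? e.getNodeName() + "[" + position + "]" : e.getNodeName();
    }

    // Step 3: walk up with getParentNode() until root is reached.
    // Returns null when target is not an element inside root.
    static String pathFrom(Element root, Node target) {
        StringBuilder path = new StringBuilder();
        Node current = target;
        while (current != null && !current.isSameNode(root)) {
            if (current.getNodeType() != Node.ELEMENT_NODE) {
                return null; // climbed past the document element without meeting root
            }
            path.insert(0, "/" + step((Element) current));
            current = current.getParentNode();
        }
        return current == null ? null : root.getNodeName() + path;
    }

    public static void main(String[] args) {
        String xml = "<table>"
                + "<tr><td><b>11</b></td><td><i>12</i></td></tr>"
                + "<tr><td><p><b>21</b></p></td>"
                + "<td><p><b>221</b></p><p><b><i>222</i></b></p></td></tr>"
                + "<tr><td><b>31</b></td><td><i>32</i></td></tr>"
                + "</table>";
        Document doc = parse(xml);
        Node target = select(doc, "//i[. = '222']");
        // prints table/tr[2]/td[2]/p[2]/b/i
        System.out.println(pathFrom(doc.getDocumentElement(), target));
    }
}
```

Building the string from the target upwards means each step only has to inspect its own siblings, so the table structure does not need to be known in advance.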
亣腦蒛氧 2024-10-17 14:43:06

This cannot be done in pure XPath 1.0 alone.

XPath 2.0 solution:

if(not($vStart intersect $vTarget/ancestor::*))
  then ()
  else
   for $vPath in
      string-join
          ((for $x in
                $vTarget
                  /ancestor-or-self::*[. >> $vStart]
                    /concat(name(.),
                            for $n in name(.),
                                $cn in count(../*[name(.) eq $n])
                             return
                               if($cn ge 2)
                                 then concat('[', 
                                               count(preceding-sibling::*
                                                              [name() eq $n]) +1, 
                                             ']')
                                 else (),
                            '/'
                               )
               return $x),
              ''
           )
           return string-join((concat(name($vStart), '/'),$vPath), '')

When this XPath 2.0 expression is evaluated against the following XML document:

<table>
  <tr>
    <td><b>11</b></td>
    <td><i>12</i></td>
  </tr>
  <tr>
    <td><p><b>21</b></p></td>
    <td><p><b>221</b></p><p><b><i>222</i></b></p></td>
  </tr>
  <tr>
    <td><b>31</b></td>
    <td><i>32</i></td>
  </tr>
</table>

and if the two parameters are defined as:

  <xsl:variable name="vStart" select="/*"/>
  <xsl:variable name="vTarget" select="/*/tr[2]/td[2]/p[2]/b/i"/>

then the result of the evaluation of the XPath 2.0 expression above is:

table/tr[2]/td[2]/p[2]/b/i/
吾性傲以野 2024-10-17 14:43:06

If you know the names of the root element and the child element you are trying to select, and there is only one child element with that name, you could simply use "/root//child". But maybe I have misunderstood what you are trying to achieve. Could you give an example?
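
For illustration, a "/root//child" style expression can be evaluated from Java with the JDK's javax.xml.xpath API, roughly as sketched below. The class name DescendantLookup and its evaluate helper are hypothetical, and the sample fragment is assumed to be well-formed XML with a table root element. Evaluating count(...) first is one way to check that the descendant name really is unique.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class DescendantLookup {

    // Evaluates an XPath 1.0 expression against the XML text and
    // returns the result converted to a string.
    static String evaluate(String xml, String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<table><tr><td><b>11</b></td><td><i>12</i></td></tr></table>";
        // Check that exactly one <i> descendant exists before relying on "//".
        System.out.println(evaluate(xml, "count(/table//i)")); // prints 1
        System.out.println(evaluate(xml, "/table//i"));        // prints 12
    }
}
```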
