XPath, flat hierarchy and stop condition

Posted on 2024-11-27 00:28:41

I need to construct Start objects from very badly structured XML. I wrote a SAX parser for one case, but it's messy, and I would like to try XPath.

I have the following XML:

<doc>
    <start/>
    <a/>
    <b/>
    <item/>
    <item/>
    <item/>

    <start/>
    <item/>
    <item/>
    <item/>

    <start/>
    <b/>
    <item/>
    <item/>
    <item/>

</doc>

However, I would much prefer this document (which I don't have):

<doc>
    <start>
        <a/>
        <b/>
        <item/>
        <item/>
        <item/>
    </start>

    <start>
        <item/>
        <item/>
        <item/>
    </start>

    <start>
        <b/>
        <item/>
        <item/>
        <item/>
    </start>

</doc>

Suppose I have the second "start" node object (from the first XML example). I'd like to get the "a" and "b" elements that directly follow this node. However, if I make a relative query from this node (using following-sibling) for the "b" node, I get the "b" node under the third start node. Is it possible to say "find node X following this node, but stop at node Y (and return null)"?

I know I can use "|" to combine multiple queries, but this is not what I want (though it could possibly solve my problem too).

Thank you.

天气好吗我好吗 2024-12-04 00:28:41

If you go with XSLT 1.0, you can also group adjacent siblings using a key (xsl:key), which simplifies the XPath expressions:

 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>

    <xsl:key name="k_adjChild"
        match="/*/*[not(self::start)]"
        use="generate-id(preceding-sibling::start[1])"
        />

    <xsl:template match="doc">
        <doc>
            <xsl:apply-templates select="start"/>
        </doc>
    </xsl:template>

    <xsl:template match="start">
        <xsl:copy>
            <xsl:copy-of select="key('k_adjChild', generate-id())" />
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
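
For readers working outside XSLT, the same grouping idea (attach every non-start child to its nearest preceding start) can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration, assuming every element is a direct child of doc as in the question; the name group_by_start is hypothetical and not part of the stylesheet above.

```python
import copy
import xml.etree.ElementTree as ET

# Flat input from the question, condensed onto fewer lines.
FLAT_XML = """<doc>
    <start/><a/><b/><item/><item/><item/>
    <start/><item/><item/><item/>
    <start/><b/><item/><item/><item/>
</doc>"""


def group_by_start(doc):
    """Return a new element where each <start> wraps the siblings that followed it."""
    nested = ET.Element(doc.tag, dict(doc.attrib))
    current = None  # the most recent <start> seen so far
    for child in doc:
        if child.tag == "start":
            # A new group begins; later siblings go under this copy.
            current = ET.SubElement(nested, "start", dict(child.attrib))
        elif current is not None:
            current.append(copy.deepcopy(child))
        # Children before the first <start> are skipped, as the key lookup does.
    return nested


nested = group_by_start(ET.fromstring(FLAT_XML))
groups = [[c.tag for c in start] for start in nested]
print(groups)
```

The single pass plays the role of the key: instead of looking up children by the generated id of their preceding start, it remembers the last start seen while walking the siblings in document order.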

要走就滚别墨迹 2024-12-04 00:28:41

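Assuming the context is a particular <start> element, this XPath will select all of the nodes between the current <start> and the following <start>. Note that current() is an XSLT function, so the expression is meant to be used inside a stylesheet.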

following-sibling::node()[not(self::start)]
                         [generate-id(preceding-sibling::start[1]) 
                           = generate-id(current())]

This XSLT applies that XPath in order to group the content by the <start> elements.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="doc">
        <xsl:copy>
            <xsl:apply-templates select="@*|start" />
        </xsl:copy>
    </xsl:template>

    <!--for each start element, copy it,
        apply templates for its attributes (in case any exist)
        and for nodes that are following siblings
        whose first preceding-sibling start is this start element-->
    <xsl:template match="start">
        <xsl:copy>
            <xsl:apply-templates select="@*
                | following-sibling::node()[not(self::start)]
                    [generate-id(preceding-sibling::start[1]) 
                      = generate-id(current())]" />
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
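
The "find node X after this node, but stop at node Y" behaviour from the question can also be written as a plain sibling walk. The following is a minimal sketch in Python using xml.etree.ElementTree, assuming the flat layout from the question. The helper first_before_stop is hypothetical, and because ElementTree elements carry no parent pointer, the parent is passed in explicitly.

```python
import xml.etree.ElementTree as ET

FLAT_XML = """<doc>
    <start/><a/><b/><item/><item/><item/>
    <start/><item/><item/><item/>
    <start/><b/><item/><item/><item/>
</doc>"""


def first_before_stop(parent, context, wanted, stop="start"):
    """Return the first `wanted` sibling after `context`, or None if a
    `stop` element (or the end of the siblings) is reached first."""
    seen_context = False
    for sibling in parent:
        if sibling is context:
            seen_context = True      # start looking from the next sibling
        elif seen_context:
            if sibling.tag == stop:
                return None          # hit the boundary: nothing in this group
            if sibling.tag == wanted:
                return sibling
    return None


doc = ET.fromstring(FLAT_XML)
starts = doc.findall("start")

b_after_first = first_before_stop(doc, starts[0], "b")
b_after_second = first_before_stop(doc, starts[1], "b")
print(b_after_first is not None, b_after_second)
```

For the second start element the walk reaches the third start before any b, so the result is None rather than the b that belongs to the third group.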

佼人 2024-12-04 00:28:41

Assuming the input XML is in the file in.xml, this XQuery script does what you want:

(:
  This library function can be found here:
  http://www.xqueryfunctions.com/xq/functx_index-of-node.html
:)
declare namespace functx = "http://www.functx.com"; 
declare function functx:index-of-node($nodes as node()* ,
$nodeToFind as node() )  as xs:integer* 
{     
  for $seq in (1 to count($nodes))
  return $seq[$nodes[$seq] is $nodeToFind]
};

(:
  Recursively build the start elements, with the elements between them
  as children.
  Take the first two indices of $positions and create a start element
  from the elements of $elements positioned between these two indices.
  Then drop the first index of $positions and make the recursive call.
  Input:
    $positions: Sequence with start element indices (belongs to $elements)
    $elements: Element sequence
  Output:
    Sequence of start elements with child elements
:)
declare function local:partition($positions as xs:integer*, 
    $elements as element()*) as element()* 
{
  let $len := count($positions)
  return
    if($len gt 1)
    then (
      let $first := $positions[1]
      let $second := $positions[2]
      let $rest := subsequence($positions, 2)
      return
        ( element start
          {
            subsequence($elements, $first + 1, $second - $first - 1)
          },
          local:partition($rest, $elements)
        )
    ) 
    else if($len eq 1)
    then (
          element start
          {
            subsequence($elements, $positions[1] + 1)
          }
    )
    else () 
};

(: Input document :)
let $input-doc := doc('in.xml')

(: Sequence of all child elements of root element doc :)
let $childs := $input-doc/doc/node()[. instance of element()]

(: Sequence with the indices of the start elements in $childs :)
let $positions := for $s in $input-doc/doc/start 
                  return functx:index-of-node($childs, $s)

return 
  <doc>
  {
    local:partition($positions, $childs)
  }
  </doc>

The output is:

<doc>
  <start>
    <a/>
    <b/>
    <item/>
    <item/>
    <item/>
  </start>
  <start>
    <item/>
    <item/>
    <item/>
  </start>
  <start>
    <b/>
    <item/>
    <item/>
    <item/>
  </start>
</doc>

Tested with XQilla, but any other XQuery processor should produce the same result.
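
The index-based partitioning used by local:partition can be illustrated without recursion. This is a rough Python sketch that works on a plain list of tag names rather than on XML nodes, which is a simplification made here for brevity; the function name partition and the marker parameter are assumptions for illustration.

```python
def partition(tags, marker="start"):
    """Split `tags` into groups, one per `marker`, each holding the
    entries between that marker and the next one (or the end)."""
    # Indices of the markers, like $positions in the XQuery.
    positions = [i for i, tag in enumerate(tags) if tag == marker]
    # Each group ends where the next marker begins; the last runs to the end.
    bounds = positions[1:] + [len(tags)]
    return [tags[pos + 1:end] for pos, end in zip(positions, bounds)]


flat = ["start", "a", "b", "item", "item", "item",
        "start", "item", "item", "item",
        "start", "b", "item", "item", "item"]

groups = partition(flat)
print(groups)
```

Pairing each position with the following one replaces the recursive step of taking the first two indices and recursing on the rest.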
