How can I list only the unique pairs in an XML document with an XSLT stylesheet?

Posted on 2024-12-17 18:51:31

This is my XML structure:

<dblp>

<inproceedings key="aaa" mdate="bbb">
<author>author1</author>
<author>author2</author>
<author>author3</author>
<title>Title of pubblications</title>
<pages>12345</pages>
<year>12345</year>
<crossref>sometext</crossref>
<booktitle>sometext</booktitle>
<url>sometext</url>
<ee>sometext</ee>
</inproceedings>

<article key="aaa" mdate="bbb">
<author>author1</author>
<author>author4</author>
<title>Title of pubblications</title>
<pages>12345</pages>
<year>12345</year>
<crossref>sometext</crossref>
<booktitle>sometext</booktitle>
<url>sometext</url>
<ee>sometext</ee>
</article>

<article key="aaa" mdate="bbb">
<author>author1</author>
<author>author2</author>
<title>Title of pubblications</title>
<pages>12345</pages>
<year>12345</year>
<crossref>sometext</crossref>
<booktitle>sometext</booktitle>
<url>sometext</url>
<ee>sometext</ee>
</article>

<inproceedings key="aaa" mdate="bbb">
<author>author2</author>
<author>author1</author>
<author>author5</author>
<title>Title of pubblications</title>
<pages>12345</pages>
<year>12345</year>
<crossref>sometext</crossref>
<booktitle>sometext</booktitle>
<url>sometext</url>
<ee>sometext</ee>
</inproceedings>

</dblp>

I need to display all pairs of authors who have collaborated on an article (or an inproceedings paper).

So we need to list only the unique pairs, to know which authors have collaborated.
This is my XSL, which lists all pairs, but I need to add some code to filter the selection and remove the pairs already listed:

<xsl:variable name="papers" select="dblp/*"/>
        <xsl:for-each select="$papers">
            <xsl:for-each select="author[position() != last()]">
                <xsl:variable name="a1" select="."/>
                <xsl:for-each select="following-sibling::author">
                    <xsl:value-of select="concat(translate(translate(translate($a1,' ','_'),'.',''),&quot;'&quot;,' '), '--', translate(translate(translate(.,' ','_'),'.',''),&quot;'&quot;,' '), ';&#10;')"/>
                </xsl:for-each>
            </xsl:for-each>
        </xsl:for-each>

Current output:

author1--author2
author1--author3
author2--author3
author1--author4
author1--author2
author2--author1
author2--author5
author1--author5

The output should be like this:

author1--author2
author1--author3
author2--author3
author1--author4
---
---
author2--author5
author1--author5

说不完的你爱 2024-12-24 18:51:31

An XSLT 2.0 solution:

<xsl:for-each-group select="/*/*/author" group-by=".">
    <xsl:sort select="current-grouping-key()"/>
    <xsl:variable name="firstKey" select="current-grouping-key()"/>
    <!-- Authors that sort after $firstKey and share a paper with it -->
    <xsl:for-each-group select="/*/*/author[compare(., current-grouping-key()) = 1][some $x in current-group() satisfies $x/parent::* intersect ./parent::*]" group-by=".">
        <xsl:value-of select="concat($firstKey, '--', current-grouping-key(), ';&#10;')"/>
    </xsl:for-each-group>
</xsl:for-each-group>
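
The fragment above needs an enclosing stylesheet and template before it can run. Below is a minimal sketch of a complete stylesheet, assuming text output. It also simplifies the inner selection: instead of testing every author in the document with intersect, it looks only at the papers of the current author. The variable name coPapers and that simplification are assumptions of this sketch, not part of the answer.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One group per distinct author name, in sorted order -->
    <xsl:for-each-group select="/*/*/author" group-by=".">
      <xsl:sort select="current-grouping-key()"/>
      <xsl:variable name="firstKey" select="current-grouping-key()"/>
      <!-- All papers on which this author appears (hypothetical helper) -->
      <xsl:variable name="coPapers" select="current-group()/.."/>
      <!-- Co-authors on those papers whose name sorts after $firstKey -->
      <xsl:for-each-group select="$coPapers/author[. gt $firstKey]" group-by=".">
        <xsl:sort select="current-grouping-key()"/>
        <xsl:value-of select="concat($firstKey, '--', current-grouping-key(), ';&#10;')"/>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Because the second name must sort after the first, each pair is written only once regardless of the order in which the authors appear on a paper. For the sample document this should give six lines, from author1--author2; to author2--author5;.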

甜嗑 2024-12-24 18:51:31

You can use the XSLT element xsl:for-each-group or the function distinct-values() for this.

In the template below I have put your sequence generator in a variable called round1 so it can be processed to remove duplicates. I have changed the inner loop to create an element (collab) with two attributes: pair (which matches your pairing) and an ordered version called cannonicalPair. The cannonicalPair attribute is used to eliminate duplicates that are due to a different order of the authors. Note that sometimes the order of a collaboration is significant in the real world.

Following the round1 variable is a series of loops that remove the duplicates. The first two show that you can output either ordering of a collaboration that occurs in your data set (the first or the last occurrence of each pair). The third prints the ordered form itself. The last one does not treat collaborations as duplicates if the authors appear in a different order.

<xsl:template match="/">

  <xsl:variable name="papers" select="dblp/*"/>

  <!-- round1: one collab element per author pair of every paper -->
  <xsl:variable name="round1">
    <xsl:for-each select="$papers">
      <xsl:for-each select="author[position() != last()]">
        <xsl:variable name="a1" select="."/>
        <xsl:for-each select="following-sibling::author">
          <xsl:element name="collab">
            <!-- pair: the two names in document order -->
            <xsl:attribute name="pair" select="concat(translate(translate(translate($a1,' ','_'),'.',''),&quot;'&quot;,' '), '--', translate(translate(translate(.,' ','_'),'.',''),&quot;'&quot;,' '), ';&#10;')"/>
            <!-- cannonicalPair: the same two names, smaller one first -->
            <xsl:attribute name="cannonicalPair">
              <xsl:choose>
                <xsl:when test="$a1 lt .">
                  <xsl:sequence select="concat(translate(translate(translate($a1,' ','_'),'.',''),&quot;'&quot;,' '), '--', translate(translate(translate(.,' ','_'),'.',''),&quot;'&quot;,' '), ';&#10;')"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:sequence select="concat(translate(translate(translate(.,' ','_'),'.',''),&quot;'&quot;,' '), '--', translate(translate(translate($a1,' ','_'),'.',''),&quot;'&quot;,' '), ';&#10;')"/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:attribute>
          </xsl:element>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:variable>

  <xsl:text>&#10;</xsl:text>

  <!-- Unique pairs, printed as they first occur in the data -->
  <xsl:for-each-group select="$round1/collab" group-by="@cannonicalPair">
    <xsl:value-of select="current-group()[1]/@pair"/>
  </xsl:for-each-group>

  <xsl:text>---- listing separator ----&#10;</xsl:text>

  <!-- Unique pairs, printed as they last occur in the data -->
  <xsl:for-each-group select="$round1/collab" group-by="@cannonicalPair">
    <xsl:value-of select="current-group()[last()]/@pair"/>
  </xsl:for-each-group>

  <xsl:text>---- listing separator ----&#10;</xsl:text>

  <!-- Unique pairs in their ordered form -->
  <xsl:for-each select="distinct-values($round1/collab/@cannonicalPair)">
    <xsl:value-of select="."/>
  </xsl:for-each>

  <xsl:text>---- listing separator ----&#10;</xsl:text>

  <!-- Pairs with a different author order are kept as distinct -->
  <xsl:for-each select="distinct-values($round1/collab/@pair)">
    <xsl:value-of select="."/>
  </xsl:for-each>

</xsl:template>
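
If only the ordered form of each pair is needed, the intermediate collab elements can be skipped. The following is a rough sketch under that assumption, not a replacement for the template above: it builds the ordered pair strings directly with an XPath 2.0 for expression and removes repeats with distinct-values(). The variable name pairs is hypothetical, and the translate() clean-up of the names from the question is left out for brevity.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- For every paper, pair each author with the co-authors whose
         name sorts after it, so the smaller name always comes first -->
    <xsl:variable name="pairs" select="
        for $p in dblp/*,
            $a in $p/author,
            $b in $p/author[. gt $a]
        return concat($a, '--', $b)"/>
    <!-- distinct-values() does not guarantee an order, so sort explicitly -->
    <xsl:for-each select="distinct-values($pairs)">
      <xsl:sort select="."/>
      <xsl:value-of select="concat(., ';&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Since the smaller name is always written first, author2 followed by author1 on one paper and author1 followed by author2 on another both yield the string author1--author2, which is what lets distinct-values() collapse them.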

攀登最高峰 2024-12-24 18:51:31

This XSLT 2.0 transformation: complete, short and well-formatted (27 lines):

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:key name="kAuthorBySibling" match="author"
     use="preceding-sibling::author | following-sibling::author"/>

    <xsl:variable name="vAuthors" as="element()*">
      <xsl:for-each select="distinct-values(/*/*/author)">
       <xsl:sort/>
       <a><xsl:value-of select="."/></a>
      </xsl:for-each>
    </xsl:variable>

 <xsl:template match="/">
     <xsl:sequence select=
      "for $a1 in $vAuthors,
         $doc in /,
           $a2 in $vAuthors
                   [. gt $a1
                  and
                    . = key('kAuthorBySibling', $a1, $doc)
                   ]

        return ($a1/string(), $a2/string(), '&#xA;')
      "/>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<dblp>
    <inproceedings key="aaa" mdate="bbb">
        <author>author1</author>
        <author>author2</author>
        <author>author3</author>
        <title>Title of pubblications</title>
        <pages>12345</pages>
        <year>12345</year>
        <crossref>sometext</crossref>
        <booktitle>sometext</booktitle>
        <url>sometext</url>
        <ee>sometext</ee>
    </inproceedings>
    <article key="aaa" mdate="bbb">
        <author>author1</author>
        <author>author4</author>
        <title>Title of pubblications</title>
        <pages>12345</pages>
        <year>12345</year>
        <crossref>sometext</crossref>
        <booktitle>sometext</booktitle>
        <url>sometext</url>
        <ee>sometext</ee>
    </article>
    <article key="aaa" mdate="bbb">
        <author>author1</author>
        <author>author2</author>
        <title>Title of pubblications</title>
        <pages>12345</pages>
        <year>12345</year>
        <crossref>sometext</crossref>
        <booktitle>sometext</booktitle>
        <url>sometext</url>
        <ee>sometext</ee>
    </article>
    <inproceedings key="aaa" mdate="bbb">
        <author>author2</author>
        <author>author1</author>
        <author>author5</author>
        <title>Title of pubblications</title>
        <pages>12345</pages>
        <year>12345</year>
        <crossref>sometext</crossref>
        <booktitle>sometext</booktitle>
        <url>sometext</url>
        <ee>sometext</ee>
    </inproceedings>
</dblp>

produces the wanted, correct result:

 author1 author2
 author1 author3 
 author1 author4 
 author1 author5 
 author2 author3 
 author2 author5

Explanation:

The variable $vAuthors is a sequence of a elements whose set of string values is the set of distinct values of the author elements in the XML document. The a elements are sorted in this sequence.
The key 'kAuthorBySibling' identifies any author element by (the string value of) any of its siblings.
For each $a1 in $vAuthors we get any $a2 in $vAuthors with string value greater than that of $a1 and such that an author element with string value equal to that of $a2 is a sibling of an author element with string value equal to that of $a1. For this we simply check (using the general equality operator) whether the string value of $a2 is in the set of string values of key('kAuthorBySibling', $a1, $doc).
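
All of the answers here rely on XSLT 2.0. For a processor limited to XSLT 1.0, the same sibling idea can be approximated by checking earlier papers. The sketch below is an assumption-laden alternative rather than part of the answer above: it keeps the loops from the question and emits a pair only if no preceding paper already contains both names. The variable names a1 and a2 follow the question, and the translate() clean-up is omitted.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="dblp/*/author">
      <xsl:variable name="a1" select="."/>
      <xsl:for-each select="following-sibling::author">
        <xsl:variable name="a2" select="."/>
        <!-- Skip the pair if an earlier paper lists both authors,
             in either order -->
        <xsl:if test="not(../preceding-sibling::*[author = $a1 and author = $a2])">
          <xsl:value-of select="concat($a1, '--', $a2, ';&#10;')"/>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This keeps each pair in the order of its first occurrence, which matches the listing the question asks for. It does not remove a pair repeated within a single paper, and the check against all preceding papers grows quadratically with the number of papers, so it is likely to be slow on a large document.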