Using grouping to combine text and then test it

Posted on 2024-08-20 03:14:03

So in this grotty extruded typesetting product, I sometimes see links and email addresses that have been split apart. Example:

<p>Here is some random text with an email address 
<Link>example</Link><Link>@example.com</Link> and here 
is more random text with a url 
<Link>http://www.</Link><Link>example.com</Link> near the end of the sentence.</p>

Desired output:

<p>Here is some random text with an email address 
<email>example@example.com</email> and here is more random text 
with a url <ext-link ext-link-type="uri" xlink:href="http://www.example.com/">
http://www.example.com/</ext-link> near the end of the sentence.</p>

Whitespace between the elements does not appear to occur, which is one blessing.

I can tell I need to use an xsl:for-each-group within the p template, but I can't quite see how to put the combined text from the group through the contains() function so as to distinguish emails from URLs. Help?

Comments (2)

韬韬不绝 2024-08-27 03:14:03

If you use group-adjacent, each run of adjacent <Link> elements becomes one group, and you can simply string-join the current-group() before testing it, as in:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xsd"
  version="2.0">

  <xsl:template match="p">
    <xsl:copy>
      <xsl:for-each-group select="node()" group-adjacent="boolean(self::Link)">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <xsl:variable name="link-text" as="xsd:string" select="string-join(current-group(), '')"/>
            <xsl:choose>
              <xsl:when test="matches($link-text, '^https?://')">
                <ext-link ext-link-type="uri" xlink:href="{$link-text}">
                  <xsl:value-of select="$link-text"/>
                </ext-link>
              </xsl:when>
              <xsl:otherwise>
                <email><xsl:value-of select="$link-text"/></email>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:when>
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
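
Since the question asks about contains(), here is a hedged variation on the stylesheet above. It is only a sketch under some assumptions: that a run containing '://' is a URL, that a run containing '@' is an email address, and that anything else should be kept visibly rather than silently labelled as an email. The <link type="unknown"> fallback element and the identity template are additions for illustration, not part of the stylesheet above.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  version="2.0">

  <!-- Identity template: copy anything not handled explicitly below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Adjacent Link elements form one group (key true);
           everything between them forms the other groups (key false). -->
      <xsl:for-each-group select="node()" group-adjacent="boolean(self::Link)">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <!-- Combined text of the whole run of Link elements. -->
            <xsl:variable name="link-text"
                          select="string-join(current-group(), '')"/>
            <xsl:choose>
              <!-- Assumption: '://' marks a URL. Tested first so that
                   URLs containing '@' are not taken for emails. -->
              <xsl:when test="contains($link-text, '://')">
                <ext-link ext-link-type="uri" xlink:href="{$link-text}">
                  <xsl:value-of select="$link-text"/>
                </ext-link>
              </xsl:when>
              <!-- Assumption: '@' marks an email address. -->
              <xsl:when test="contains($link-text, '@')">
                <email><xsl:value-of select="$link-text"/></email>
              </xsl:when>
              <!-- Hypothetical fallback for anything unrecognised. -->
              <xsl:otherwise>
                <link type="unknown"><xsl:value-of select="$link-text"/></link>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:when>
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Testing for '://' before '@' is a deliberate ordering choice; whether these two substring checks are strict enough depends on what the typesetting product actually emits.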
落花浅忆 2024-08-27 03:14:03

Here is an XSLT 1.0 solution based on the identity template, with special treatment for <Link> elements.

<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*" />
  </xsl:copy>
</xsl:template>

<xsl:template match="Link">
  <xsl:if test="not(preceding-sibling::node()[1][self::Link])">
    <xsl:variable name="link">
      <xsl:copy-of select="
        text()
        | 
        following-sibling::Link[
          preceding-sibling::node()[1][self::Link]
          and
          generate-id(current())
          =
          generate-id(
            preceding-sibling::Link[
              not(preceding-sibling::node()[1][self::Link])
            ][1]
          )
        ]/text()
      " />
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="contains($link, '://')">
        <ext-link ext-link-type="uri" xlink:href="{$link}" />
      </xsl:when>
      <xsl:when test="contains($link, '@')">
        <email>
          <xsl:value-of select="$link" />
        </email>
      </xsl:when>
      <xsl:otherwise>
        <link type="unknown">
          <xsl:value-of select="$link" />
        </link>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:if>
</xsl:template>

I know the XPath expressions used here are quite hairy monsters, but selecting adjacent siblings is not easy in XPath 1.0 (if someone has a better idea how to do it in XPath 1.0, go ahead and tell me).

not(preceding-sibling::node()[1][self::Link])

means "the immediately preceding node must not be a <Link>", i.e. it matches only <Link> elements that are "first in a row".

following-sibling::Link[
  preceding-sibling::node()[1][self::Link]
  and
  generate-id(current())
  =
  generate-id(
    preceding-sibling::Link[
      not(preceding-sibling::node()[1][self::Link])
    ][1]
  )
]

means

  • from all following-sibling <Link>s, choose the ones that
    • immediately follow a <Link> (i.e. they are not "first in a row"), and
    • whose closest preceding "first in a row" <Link> is the current() node (which is itself always "first in a row"); the two nodes are compared via generate-id()

If that makes sense.
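
For comparison, one alternative that is sometimes used in XSLT 1.0 is "sibling recursion": instead of selecting the whole run with one large predicate, each <Link> hands over to the node immediately after it, as long as that node is also a <Link>. The following is only a sketch of that idea, not the approach used above. The mode name "collect" is made up for illustration, and the text content inside <ext-link> follows the asker's desired output rather than the output shown in this answer.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- A Link directly preceded by another Link is not "first in a row".
       Its text is picked up in "collect" mode, so emit nothing here.
       The predicate gives this pattern a higher default priority
       than the plain "Link" pattern below. -->
  <xsl:template match="Link[preceding-sibling::node()[1][self::Link]]"/>

  <!-- A Link that is "first in a row": gather the run, then classify it. -->
  <xsl:template match="Link">
    <xsl:variable name="link">
      <xsl:apply-templates select="." mode="collect"/>
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="contains($link, '://')">
        <ext-link ext-link-type="uri" xlink:href="{$link}">
          <xsl:value-of select="$link"/>
        </ext-link>
      </xsl:when>
      <xsl:when test="contains($link, '@')">
        <email><xsl:value-of select="$link"/></email>
      </xsl:when>
      <xsl:otherwise>
        <link type="unknown"><xsl:value-of select="$link"/></link>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical "collect" mode: output this Link's text, then recurse
       into the next node only if it is itself a Link. -->
  <xsl:template match="Link" mode="collect">
    <xsl:value-of select="."/>
    <xsl:apply-templates
      select="following-sibling::node()[1][self::Link]" mode="collect"/>
  </xsl:template>

</xsl:stylesheet>
```

The recursion stops at the first node that is not a <Link>, so this relies on the observation in the question that no whitespace occurs between the split elements; a whitespace-only text node between two <Link>s would end the run.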

Applied to your input, I get:

<p>Here is some random text with an email address
<email>example@example.com</email> and here
is more random text with a url
<ext-link ext-link-type="uri" xlink:href="http://www.example.com" /> near the end of the sentence.</p>