XSLT: remove query strings from all URLs in an XML file

Posted 2024-11-09 18:48:42

I need to perform a regular-expression-style replacement that strips the query strings from all the attributes in an MRSS RSS feed, leaving just the URL. I've tried a few things using the suggestions from here: XSLT Replace function not found, but to no avail.

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<atom:link href="http://www.videojug.com/user/metacafefamilyandeducation/subscriptions.mrss" type="application/rss+xml" rel="self" />
<title>How to and instructional videos from Videojug.com</title>
<description>Award-winning Videojug.com has over 50k professionally-made instructional videos.</description>
<link>http://www.videojug.com</link>
<item>
  <title>How To Calculate Median</title>
  <media:content url="http://direct.someurl.com/54/543178dd-11a7-4b8d-764c-ff0008cd2e95/how-to-calculate-median__VJ480PENG.mp4?somequerystring" type="video/mp4" bitrate="1200" height="848" duration="169" width="480">
    <media:title>How To Calculate Median</media:title>
    ..
  </media:content>
</item>

Any suggestions would be really helpful.

Comments (2)

肥爪爪 2024-11-16 18:48:42

If you're using XSLT 2.0, you can use tokenize():

  <xsl:template match="media:content">
    <xsl:value-of select="tokenize(@url,'\?')[1]"/>
  </xsl:template>

Here's another example of only changing the url attribute of media:content:

  <xsl:template match="media:content">
    <media:content url="{tokenize(@url,'\?')[1]}">
      <xsl:copy-of select="@*[not(name()='url')]"/>
      <xsl:apply-templates/>
    </media:content>
  </xsl:template>

EDIT

To handle all url attributes in your instance, and leave everything else unchanged, use an identity transform and only override it with a template for @url.

Here's a modified version of your sample XML. I've added two attributes to description for testing. The attr attribute should be left untouched and the url attribute should be processed.

XML

<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <atom:link href="http://www.videojug.com/user/metacafefamilyandeducation/subscriptions.mrss" type="application/rss+xml" rel="self"/>
    <title>How to and instructional videos from Videojug.com</title>
    <!-- added some attributes for testing -->
    <description attr="don't delete me!" url="http://www.test.com/foo?anotherquerystring">Award-winning Videojug.com has over 50k professionally-made instructional videos.</description>
    <link>http://www.videojug.com</link>
    <item>
      <title>How To Calculate Median</title>
      <media:content url="http://direct.someurl.com/54/543178dd-11a7-4b8d-764c-ff0008cd2e95/how-to-calculate-median__VJ480PENG.mp4?somequerystring" type="video/mp4" bitrate="1200" height="848"
        duration="169" width="480">
        <media:title>How To Calculate Median</media:title>
        .. 
      </media:content>
    </item>
  </channel>
</rss>

XSLT

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!--Identity Transform-->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@url">
    <xsl:attribute name="url">
      <xsl:value-of select="tokenize(.,'\?')[1]"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>

OUTPUT (Using Saxon 9.3.0.5)

<rss xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:media="http://search.yahoo.com/mrss/"
     version="2.0">
   <channel>
      <atom:link href="http://www.videojug.com/user/metacafefamilyandeducation/subscriptions.mrss"
                 type="application/rss+xml"
                 rel="self"/>
      <title>How to and instructional videos from Videojug.com</title>
      <!-- added some attributes for testing --><description attr="don't delete me!" url="http://www.test.com/foo">Award-winning Videojug.com has over 50k professionally-made instructional videos.</description>
      <link>http://www.videojug.com</link>
      <item>
         <title>How To Calculate Median</title>
         <media:content url="http://direct.someurl.com/54/543178dd-11a7-4b8d-764c-ff0008cd2e95/how-to-calculate-median__VJ480PENG.mp4"
                        type="video/mp4"
                        bitrate="1200"
                        height="848"
                        duration="169"
                        width="480">
            <media:title>How To Calculate Median</media:title>
        .. 
      </media:content>
      </item>
   </channel>
</rss>
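
The question mentions "all the attributes", while the stylesheet above only touches attributes named url. A possible variant, sketched here under the assumption that any attribute whose value contains a `?` holds a URL (an assumption that does not come from the original answer and may be too broad for some feeds), matches on the attribute value instead of its name:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy every node and attribute unchanged by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: any attribute whose value contains '?' holds a URL.
       Rebuild it under its original name and namespace, keeping only
       the text before the first '?'. -->
  <xsl:template match="@*[contains(., '?')]">
    <xsl:attribute name="{name()}" namespace="{namespace-uri()}">
      <xsl:value-of select="tokenize(., '\?')[1]"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

Attributes without a `?` fall through to the identity template, so something like attr="don't delete me!" is still copied as-is. If the feed has non-URL attributes that may legitimately contain a `?`, narrowing the match (for example to `@url | @href`) would be safer.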

倥絔 2024-11-16 18:48:42

String handling in XSLT is generally a lot easier with XSLT 2.0, but in this case the requirement looks easy enough to meet with the substring-before() function, which has been available since XSLT 1.0.
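
A minimal XSLT 1.0 sketch of that idea, reusing the identity-transform pattern from the other answer, might look like the following. One detail to be aware of: substring-before() returns an empty string when the separator is absent, so the sketch appends a `?` to the value first, which keeps URLs that have no query string intact.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy every node and attribute unchanged by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for url attributes: keep only the text before the first '?'.
       concat(., '?') guarantees a '?' is present, so a value without a
       query string is returned whole instead of becoming empty. -->
  <xsl:template match="@url">
    <xsl:attribute name="url">
      <xsl:value-of select="substring-before(concat(., '?'), '?')"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

Because it uses only XSLT 1.0 functions, this should also run on processors that do not support tokenize().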
