Using XSLT to get an element's value and then remove it
I'm using XSLT to clean up some InDesign-related XML for use in other systems. I need to be able to grab the value from tags nested in the text body then remove them.
Specifically, the headline and byline come embedded in the text body. I need to be able to extract these and put them in header tags - I'm able to do that, but I can't seem to get them out of the body while I'm at it.
Here's my (simplified) XML:
<?xml version="1.0" encoding="UTF-8"?>
<k4Export xmlns="http://www.vjoon.com/K4Export/1.4.2">
<publication>
<id>107233722</id>
<name>NGM</name>
<origin>origin</origin>
<issue>
<article>
<textObjects>
<textObject>
<text>
<inlineTag name="Story">
<inlineTag name="body">
<inlineTag name="headline">The Headline</inlineTag> Lorem ipsum dolor sit amet,
consectetur adipiscing elit. <em>Vivamus mollis</em> ligula quis mi
blandit interdum. In rutrum imperdiet suscipit. Fusce interdum,
sem id scelerisque molestie, purus ligula fringilla sapien, nec
auctor velit eros eget felis. Duis eu tellus purus. Donec id viverra
neque.</inlineTag>
<inlineTag name="body">Donec nec nulla neque, sit amet placerat
elit. Nulla pulvinar elit sapien. Donec venenatis, arcu sed
pellentesque ultrices, neque mi sollicitudin elit, nec fermentum
eros nibh aliquam leo. Nam lectus neque, dapibus in scelerisque
in, fermentum nec ipsum.</inlineTag>
<inlineTag name="body">Sed sed <strong>congue</strong> neque. Nulla
nec ipsum vitae lacus consectetur convallis sed et nulla. Integer
posuere viverra felis, at pulvinar risus scelerisque ac. Aliquam a
orci ac est iaculis porta. Duis sollicitudin lectus sit amet velit
condimentum lobortis.
<inlineTag name="byline">-John Doe</inlineTag></inlineTag></inlineTag>
</text>
</textObject>
</textObjects>
</article>
</issue>
</publication>
</k4Export>
And here's the XSLT I'm using to transform. I'm able to get the headline and byline into the header, but I'm not able to get it out of the content. I'm an XSLT noob so any advice will be appreciated. The textObject elements come spread out all over the XML document, so I'm intentionally using very general XPath selectors to get to them.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:default="http://www.vjoon.com/K4Export/1.4.2"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    exclude-result-prefixes="default">
  <!-- Output Content -->
  <xsl:template match="/">
    <html>
      <head>
        <title>Sample</title>
      </head>
      <body>
        <!-- Headline-->
        <xsl:variable name="headlines" select="//default:inlineTag[@name='headline']" />
        <xsl:choose>
          <xsl:when test="$headlines">
            <xsl:for-each select="$headlines">
              <h1 class="headline"><xsl:value-of select="node()"/></h1>
            </xsl:for-each>
          </xsl:when>
          <xsl:otherwise>
            <h1 class="headline">Headline Absent</h1>
          </xsl:otherwise>
        </xsl:choose>
        <!-- Bylines -->
        <xsl:variable name="bylines" select="//default:inlineTag[@name='byline']" />
        <xsl:choose>
          <xsl:when test="$bylines">
            <xsl:for-each select="$bylines">
              <h2 class="byline"><xsl:value-of select="node()"/></h2>
            </xsl:for-each>
          </xsl:when>
          <xsl:otherwise>
            <h2 class="byline">Byline Absent</h2>
          </xsl:otherwise>
        </xsl:choose>
        <div id="content">
          <!-- body -->
          <xsl:variable name="bodies" select="//default:inlineTag[@name='body']" />
          <xsl:choose>
            <xsl:when test="$bodies">
              <xsl:for-each select="$bodies">
                <p><xsl:value-of select="node()"/></p>
              </xsl:for-each>
            </xsl:when>
          </xsl:choose>
        </div>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
Comments (1)
I think you want something like this (note that the <xsl:for-each> is now replaced by <xsl:apply-templates>, and there are different templates for processing the inlineTag elements with different values of their name attribute. In particular, empty-bodied templates do not copy the node they match to the output). Other than this, I haven't made any attempt to refactor or improve your code; it has big potential for improvement. The result now doesn't contain either the headline or the byline.
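The stylesheet this answer refers to is not included in this copy of the thread. What follows is only a rough sketch of the approach it describes, not the answerer's original code. It assumes the same namespace and element names as the question, and it drops the "Headline Absent" / "Byline Absent" fallback branches to keep things short. The body paragraphs go through <xsl:apply-templates>, and an empty template swallows any headline or byline found inside them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:default="http://www.vjoon.com/K4Export/1.4.2"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    exclude-result-prefixes="default">

  <xsl:template match="/">
    <html>
      <head>
        <title>Sample</title>
      </head>
      <body>
        <!-- Header: same lookups as in the question (fallback branches omitted for brevity) -->
        <xsl:for-each select="//default:inlineTag[@name='headline']">
          <h1 class="headline"><xsl:value-of select="."/></h1>
        </xsl:for-each>
        <xsl:for-each select="//default:inlineTag[@name='byline']">
          <h2 class="byline"><xsl:value-of select="."/></h2>
        </xsl:for-each>
        <div id="content">
          <!-- Hand each body paragraph to the templates below -->
          <xsl:apply-templates select="//default:inlineTag[@name='body']"/>
        </div>
      </body>
    </html>
  </xsl:template>

  <!-- A body paragraph: wrap it in a p element and process its children -->
  <xsl:template match="default:inlineTag[@name='body']">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Empty template: a headline or byline met while processing a body produces no output -->
  <xsl:template match="default:inlineTag[@name='headline' or @name='byline']"/>

</xsl:stylesheet>
```

Text inside <em> and <strong> still comes through, because the built-in template rules recurse into elements that have no template of their own and copy their text nodes. As with the question's <xsl:value-of>, the inline markup itself is dropped. With the sample XML, the first paragraph should no longer contain "The Headline", and the last one should no longer contain "-John Doe".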