Does XSLT provide a way to identify XML elements using regular expressions?
I have a sample xml file which looks like this:
--- before transformation ---
<root-node>
  <child-type-A> ... </child-type-A>
  <child-type-A> ... </child-type-A>
  <child-type-B> ... </child-type-B>
  <child-type-C>
    <child-type-B> ... </child-type-B>
    ...
  </child-type-C>
  ...
</root-node>
I want to transform this XML file into something that looks like this:
--- after transformation ---
<root-node>
  <child-node> ... </child-node>
  <child-node> ... </child-node>
  <child-node> ... </child-node>
  <child-node>
    <child-node> ... </child-node>
    ...
  </child-node>
  ...
</root-node>
Effectively that means that the document structure remains the same, but some 'chosen' elements are renamed. These chosen elements start with the same prefix (in this example with "child-type-") but have varying suffixes ("A" | "B" | "C" | etc.).
Why all this hassle? I have a piece of software that requires an XML file as input. For convenience I edit that file with the help of an XML schema, which helps ensure the file is correct. Sadly, XML schemas are somewhat lacking when it comes to context sensitivity. This leads to an XML file that looks like the one shown in /before transformation/. The software cannot process such a file because it expects one like that shown in /after transformation/. Hence the need for the transformation.
I want to do the transformation with XSLT and I have already figured out how to do so. My approach was to define one rule for an identity transformation and one rule for each "child-type-*" element that needs to be renamed. This solution works, but it isn't very elegant: you end up with lots of rules.
--- sample transformation rules ---
<!-- Identity transformation -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>
<xsl:template match="child-type-A">
  <xsl:element name="child-node">
    <xsl:apply-templates select="@*|node()" />
  </xsl:element>
</xsl:template>
...
Is there a way to condense that into only two rules? One for the identity transformation and one for all "child-type-*" elements? Maybe by using XSLT in combination with some regular expression? Or do you have to take a different approach to tackle such a problem?
Comments (4)
Here is a generic XSLT 1.0 transformation that can work with parameters that specify the desired prefixes and, for each desired prefix, the set of suffixes, such that any element name with this prefix and one of these suffixes is renamed to a desired new name:
When applied to the provided XML document, the wanted, correct result is produced.
Do note: Using this transformation you may simultaneously rename different elements with different prefixes and their associated suffixes, specified as external parameters/documents.
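The stylesheet from this answer is not reproduced here. As a rough illustration of the parameter-driven idea only, here is a minimal XSLT 1.0 sketch that handles a single prefix. The parameter names `pPrefix`, `pSuffixes` and `pNewName`, and the `|`-delimited suffix list, are assumptions made for this example, not the answer's actual design.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameters; they can be overridden when invoking the processor -->
  <xsl:param name="pPrefix" select="'child-type-'"/>
  <!-- Allowed suffixes, each one wrapped in '|' so whole suffixes are compared -->
  <xsl:param name="pSuffixes" select="'|A|B|C|'"/>
  <xsl:param name="pNewName" select="'child-node'"/>

  <!-- Identity transformation for attributes and non-element nodes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- XSLT 1.0 does not allow parameter references in match patterns,
       so the test is done inside the template. The explicit priority
       avoids a conflict with the identity template above. -->
  <xsl:template match="*" priority="1">
    <xsl:variable name="vSuffix" select="substring-after(name(), $pPrefix)"/>
    <xsl:choose>
      <xsl:when test="starts-with(name(), $pPrefix)
                      and contains($pSuffixes, concat('|', $vSuffix, '|'))">
        <!-- Matching element: emit it under the new name -->
        <xsl:element name="{$pNewName}">
          <xsl:apply-templates select="@*|node()"/>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
        <!-- Any other element is copied unchanged -->
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Supporting several prefixes, each with its own suffix set and new name, would need a lookup structure (for example an external document), which this sketch leaves out.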
II. Equivalent XSLT 2.0 solution:
When applied to the same XML document (above), it again produces the same, correct output.
(Revised my answer)
This snippet works fine with your sample XML. I merged the two templates, because they both want to act on 'all elements'. My earlier templates didn't work because both matched the same selection.
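The snippet itself is missing from this page. A merged template along the lines described might look like the sketch below. This is a reconstruction under the assumption that the merge was done with `xsl:choose` and `starts-with`; it is not the answerer's original code.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Attributes, text, comments and processing instructions are copied as-is -->
  <xsl:template match="@*|text()|comment()|processing-instruction()">
    <xsl:copy/>
  </xsl:template>

  <!-- One template for all elements: rename or copy, decided per element -->
  <xsl:template match="*">
    <xsl:choose>
      <xsl:when test="starts-with(name(), 'child-type-')">
        <!-- Elements with the chosen prefix are renamed -->
        <child-node>
          <xsl:apply-templates select="@*|node()"/>
        </child-node>
      </xsl:when>
      <xsl:otherwise>
        <!-- Everything else keeps its name -->
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because only one template matches element nodes, there is no ambiguity about which rule applies, which is the problem the merge was meant to solve.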
Given your source XML, this produces the renamed output.
XSLT has a starts-with function, which can be used to identify elements whose names start with 'child-type', allowing you to use a single template match. See this related question: select the element which match the start-with name
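As a minimal sketch of that suggestion (assuming the full prefix `child-type-` from the question and an XSLT 1.0 processor), the two-rule stylesheet could look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transformation: copy every node unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rename any element whose name starts with "child-type-".
       A pattern with a predicate has a higher default priority than
       the identity template, so it wins for these elements. -->
  <xsl:template match="*[starts-with(name(), 'child-type-')]">
    <child-node>
      <xsl:apply-templates select="@*|node()"/>
    </child-node>
  </xsl:template>

</xsl:stylesheet>
```

This gives exactly the two rules asked for, without regular expressions. Note that it renames every element with that prefix, whatever the suffix.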
It's not a good idea to capture information by attaching meaning to the internal syntax of an element name (in extremis, one could have an XML document in which all the information was captured in the name of the root element,
<Surname_Kay.Firstname_Michael.Country_UK/>
). However, if you've got data in that form it's certainly possible to process it, for example with a template rule of the form
<xsl:template match="*[matches(name(), 'child-type-[A-Z]')]">
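For context, a complete stylesheet built around such a rule might look like the sketch below. It needs an XSLT 2.0 (or later) processor, since `matches()` is not available in XSLT 1.0. Anchoring the regular expression with `^` and `$` is an assumption added here so that the whole element name must match; the unanchored form above matches the pattern anywhere in the name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transformation: copy every node unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rename elements named "child-type-" followed by one or more
       uppercase letters (anchored regex, see the note above) -->
  <xsl:template match="*[matches(name(), '^child-type-[A-Z]+$')]">
    <child-node>
      <xsl:apply-templates select="@*|node()"/>
    </child-node>
  </xsl:template>

</xsl:stylesheet>
```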