XSLT 2.0 regex question (opening and closing elements on different matches)
I've simplified the problem somewhat, but I hope I've still captured the essence of my problem.
Let's say I have the following simple XML file:
<main>
outside1
===BEGIN===
inside1
====END====
outside2
=BEGIN=
inside2
==END==
outside3
</main>
Then I can use the following XSLT 2.0 stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="text()">
<xsl:analyze-string select="." regex="=+BEGIN=+">
<xsl:matching-substring>
<section/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:analyze-string select="." regex="=+END=+">
<xsl:matching-substring>
<_section/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
To transform it to the following:
<?xml version="1.0" encoding="UTF-8"?>
outside1
<section/>
inside1
<_section/>
outside2
<section/>
inside2
<_section/>
outside3
Here are the questions:
Multiple regexes
Is there a better way to match two different regexes than nesting one inside the other, as was done above?
- What if they're not easily nestable like this?
- Can I have XSL templates that match and transform regex matches in a text() node?
  - In this case, I'd have two templates, one for each regex
  - If possible, this would be the ideal solution
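One way to avoid the nesting, sketched here as an assumption rather than a confirmed answer, is to combine both patterns into a single regex with one capture group per alternative and then branch on `regex-group()`. The element names `section` and `_section` are taken from the stylesheet above; nothing else is new.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="text()">
    <!-- Single pass over the text: each alternative has its own capture group -->
    <xsl:analyze-string select="." regex="(=+BEGIN=+)|(=+END=+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- regex-group(n) is the empty string when group n took no part in the match -->
          <xsl:when test="regex-group(1) != ''">
            <section/>
          </xsl:when>
          <xsl:otherwise>
            <_section/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text between markers is copied through unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the regexes side by side instead of nested, so adding a third marker would only mean another alternative and another `xsl:when` branch. It still emits two empty marker elements rather than one wrapping element.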
Opening and closing elements on regex matches
Obviously, instead of:
<section/>
inside
<_section/>
What I really want eventually is:
<section>
inside
</section>
So how would you do this? I'm not sure it's even possible to open an element in one regex match and close it in another (what if there is no match for the closer? The result would not be well-formed XML!), but this task seems typical enough that there must be an idiomatic solution.
Note: we can assume that sections will not overlap, and thus also will not nest. We can also assume that they will always appear in proper pairs.
Additional info
So essentially I'm trying to accomplish what in Perl would succinctly be something like:
s/=+BEGIN=+/<section>/
s/=+END=+/<\/section>/
I'm looking for a way to do this in XSLT instead, because:
- It'd be more robust with regard to the context of the regex match
  - (i.e. it should only transform text() nodes)
- It'd also be more robust with regard to matching various XML entities
Comments (1)
This transformation:
when applied on the provided XML document:
produces the wanted result:
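A transformation along these lines can be sketched as follows. This is a minimal sketch and not necessarily the exact stylesheet meant above: it assumes, as the question states, that the markers always come in proper pairs and never nest. Instead of opening an element on one match and closing it on another, it uses a single regex that spans from the BEGIN marker to the END marker and captures everything in between.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="text()">
    <!-- flags="s" lets "." match newlines; ".*?" is reluctant, so each
         BEGIN marker is paired with the nearest following END marker -->
    <xsl:analyze-string select="."
                        regex="=+BEGIN=+(.*?)=+END=+"
                        flags="s">
      <xsl:matching-substring>
        <!-- Group 1 holds the text between the two markers -->
        <section>
          <xsl:value-of select="regex-group(1)"/>
        </section>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text outside any BEGIN/END pair is copied through unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Because the `section` element is created whole inside one `xsl:matching-substring`, the output can never contain an unclosed element. A BEGIN marker without a matching END simply fails to match and is passed through as plain text.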