How do I search and replace in XML using Groovy?

Posted on 2024-07-05 07:54:33

How do I use Groovy to search and replace in XML?

I need something as short/easy as possible, since I'll be giving this code to the testers for their SoapUI scripting.

More specifically, how do I turn:

<root><data></data></root>

into:

<root><data>value</data></root>

一世旳自豪 2024-07-12 07:54:34

Some of the things you can do with XSLT you can also do with some form of search and replace. It all depends on how complex your problem is and how generic you want the solution to be. To make your own example slightly more generic:

xml.replaceFirst("<Mobiltlf>[^<]*</Mobiltlf>", '<Mobiltlf>32165487</Mobiltlf>')

The solution you choose is up to you. In my own experience (for very simple problems), simple string lookups are faster than regular expressions, which in turn are faster than a full-blown XSLT transformation (which makes sense, actually).
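
Applied to the XML from the question, the same pattern could look like the sketch below. The variable names are made up for illustration. Because `[^<]*` matches any existing text content (including none), the call works whether or not the tag starts out empty.

```groovy
// Illustrative input taken from the question.
def xml = '<root><data></data></root>'

// [^<]* matches whatever text (possibly none) sits between the two tags.
def result = xml.replaceFirst('<data>[^<]*</data>', '<data>value</data>')
assert result == '<root><data>value</data></root>'

// The same call also overwrites an existing value.
def updated = '<root><data>old</data></root>'.replaceFirst('<data>[^<]*</data>', '<data>value</data>')
assert updated == '<root><data>value</data></root>'
```

Note that this pattern only matches a start tag without attributes.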

窗影残 2024-07-12 07:54:34

After some frenzied coding, I saw the light and did it like this:

import org.custommonkey.xmlunit.Diff
import org.custommonkey.xmlunit.XMLUnit

def input = '''<root><data></data></root>'''
def expectedResult = '''<root><data>value</data></root>'''

def xml = new XmlParser().parseText(input)

def p = xml.'**'.data
p.each{it.value="value"}

def writer = new StringWriter()
new XmlNodePrinter(new PrintWriter(writer)).print(xml)
def result = writer.toString()

XMLUnit.setIgnoreWhitespace(true)
def xmlDiff = new Diff(result, expectedResult)
assert xmlDiff.identical()

Unfortunately, this does not preserve the comments, metadata, etc. from the original XML document, so I'll have to find another way.

眼前雾蒙蒙 2024-07-12 07:54:34

That's the best answer so far and it gives the right result, so I'm going to accept the answer :)
However, it's a little too large for me. I think I had better explain that the alternative is:

xml.replace("<Mobiltlf></Mobiltlf>", "<Mobiltlf>32165487</Mobiltlf>")

But that's not very XML-like, so I thought I'd look for an alternative. Also, I can't be sure that the first tag is always empty.

寻找一个思念的角度 2024-07-12 07:54:34

To retain the attributes, just modify your little program like this (I've included a sample source to test it):

def input = """
<?xml version="1.0" encoding="utf-8"?>
<?mso-infoPathSolution name="urn:schemas-microsoft-com:office:infopath:FA_Ansoegning:http---ementor-dk-application-2007-06-22-" href="manifest.xsf" solutionVersion="1.0.0.14" productVersion="12.0.0" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?>
<application:FA_Ansoegning xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:application="http://ementor.dk/application/2007/06/22/"
xmlns:xd="http://schemas.microsoft.com/office/infopath/2003"
xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2008-04-14T14:31:48">
    <Mobiltlf  type="national" anotherattribute="value"></Mobiltlf>
  <E-mail-adresse attr="whatever"></E-mail-adresse>
</application:FA_Ansoegning>
""".trim()

def rtv = { xmlSource, tagName, newValue ->
    regex = "(<$tagName[^>]*>)([^<]*)(</$tagName>)"
    replacement = "\$1${newValue}\$3"
    xmlSource = xmlSource.replaceAll(regex, replacement)
    return xmlSource
}

input = rtv( input, "Mobiltlf", "32165487" )
input = rtv( input, "E-mail-adresse", "[email protected]" )
println input

Running this script produces:

<?xml version="1.0" encoding="utf-8"?>
<?mso-infoPathSolution name="urn:schemas-microsoft-com:office:infopath:FA_Ansoegning:http---ementor-dk-application-2007-06-22-" href="manifest.xsf" solutionVersion="1.0.0.14" productVersion="12.0.0" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?>
<application:FA_Ansoegning xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:application="http://ementor.dk/application/2007/06/22/"
xmlns:xd="http://schemas.microsoft.com/office/infopath/2003"
xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2008-04-14T14:31:48">
    <Mobiltlf  type="national" anotherattribute="value">32165487</Mobiltlf>
  <E-mail-adresse attr="whatever">[email protected]</E-mail-adresse>
</application:FA_Ansoegning>

Note that the matching regexp now contains 3 capturing groups: (1) the start tag (including attributes), (2) whatever the 'old' content of your tag is, and (3) the end tag. The replacement string refers to these captured groups via the $i syntax (with backslashes to escape them in the GString). Just a tip: regular expressions are very powerful animals, and it's really worthwhile to become familiar with them ;-) .
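
One caveat with the `$i` syntax: a new value that itself contains `$` or `\` would be read as a group reference or escape by `replaceAll`. The sketch below is a hypothetical variant (the name `replaceTagValue` is not from this thread) that quotes both the tag name and the value with the standard `java.util.regex` helpers. It assumes attribute values never contain `>` and does not XML-escape the new value.

```groovy
import java.util.regex.Matcher
import java.util.regex.Pattern

// Hypothetical helper: same start-tag / content / end-tag idea as above,
// but the tag name and the new value are quoted before the regex engine sees them.
def replaceTagValue = { String xmlSource, String tagName, String newValue ->
    def tag = Pattern.quote(tagName)   // treat the tag name literally
    // Group 1: start tag with optional attributes; group 2: end tag.
    // The \s guard stops <Mobiltlf from also matching a longer name such as <Mobiltlf2.
    def regex = '(<' + tag + '(?:\\s[^>]*)?>)[^<]*(</' + tag + '>)'
    // quoteReplacement keeps '$' and '\' in the value from being read as group references.
    xmlSource.replaceAll(regex, '$1' + Matcher.quoteReplacement(newValue) + '$2')
}

def sample = '<Mobiltlf type="national"></Mobiltlf>'
println replaceTagValue(sample, 'Mobiltlf', 'US$ 5')
```

Only two groups are captured here because the old content is simply discarded.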

眼泪淡了忧伤 2024-07-12 07:54:34

I did some testing with DOMCategory and it's almost working. I can make the replace happen, but some InfoPath-related comments disappear. I'm using a method like this:

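import groovy.xml.DOMBuilder
import groovy.xml.dom.DOMCategory
import groovy.xml.dom.DOMUtil

def rtv = { xml, tag, value ->
    def doc     = DOMBuilder.parse(new StringReader(xml))
    def root    = doc.documentElement
    // Set the text of every element named $tag anywhere below the root
    use(DOMCategory) { root.'**'."$tag".each{it.value=value} }
    return DOMUtil.serialize(root)
}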

on a source like this:

<?xml version="1.0" encoding="utf-8"?>
<?mso-infoPathSolution name="urn:schemas-microsoft-com:office:infopath:FA_Ansoegning:http---ementor-dk-application-2007-06-22-" href="manifest.xsf" solutionVersion="1.0.0.14" productVersion="12.0.0" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?>
<application:FA_Ansoegning xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:application="http://corp.dk/application/2007/06/22/"
xmlns:xd="http://schemas.microsoft.com/office/infopath/2003"
xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2008-04-14T14:31:48">
    <Mobiltlf></Mobiltlf>
  <E-mail-adresse></E-mail-adresse>
</application:FA_Ansoegning>

The only things missing from the result are the <?mso- lines. Does anyone have an idea for that?
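
A possible explanation, offered as an assumption rather than something confirmed in this thread: the <?mso- lines are processing instructions that sit next to the root element, not inside it, so serializing only `doc.documentElement` leaves them out. The sketch below uses the plain JDK DOM and identity-transformer APIs to serialize the whole document instead. `setTagText` and the sample input are hypothetical, and the exact XML declaration and whitespace in the output depend on the JDK.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.transform.TransformerFactory
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult
import org.xml.sax.InputSource

// Hypothetical helper: set the text of every element named `tag`,
// then serialize the whole Document (not just its root element).
def setTagText = { String xml, String tag, String value ->
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    def doc = builder.parse(new InputSource(new StringReader(xml)))
    def nodes = doc.getElementsByTagName(tag)
    for (int i = 0; i < nodes.getLength(); i++) {
        nodes.item(i).setTextContent(value)
    }
    def out = new StringWriter()
    // A Transformer created without a stylesheet performs an identity copy.
    TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(out))
    out.toString()
}

// Made-up sample with one processing instruction before the root element.
def source = '<?xml version="1.0" encoding="utf-8"?><?demo-pi keep="me"?><root><data></data></root>'
def result = setTagText(source, 'data', 'value')
println result
```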

作业与我同在 2024-07-12 07:54:34

Brilliant! Thank you very much for your assistance :)

That solves my problem in a much cleaner and easier way. It ended up looking like this:

def rtv = { xmlSource, tagName, newValue ->
    regex = "<$tagName>[^<]*</$tagName>"
    replacement = "<$tagName>${newValue}</$tagName>"
    xmlSource = xmlSource.replaceAll(regex, replacement)
    return xmlSource
}

input = rtv( input, "Mobiltlf", "32165487" )
input = rtv( input, "E-mail-adresse", "[email protected]" )
println input

Since I'm giving this to our testers for use in their testing tool SoapUI, I've tried to "wrap" it, to make it easier for them to copy and paste.

This is good enough for my purpose, but it would be perfect if we could add one more "twist".

Let's say the input had this in it...

<Mobiltlf type="national" anotherattribute="value"></Mobiltlf>

...and we wanted to retain those two attributes even though we replaced the value. Is there a way to use regexp for that too?

八巷 2024-07-12 07:54:34

Three "official" Groovy ways of updating XML are described on the page http://groovy.codehaus.org/Processing+XML, in the section "Updating XML".

Of those three, it seems only the DOMCategory way preserves XML comments etc.

君勿笑 2024-07-12 07:54:34

To me, the actual copy, search, and replace seems like the perfect job for an XSLT stylesheet. In XSLT you have no problem at all copying everything (including the items you're having problems with) and simply inserting your data where it is required. You can pass the specific value of your data in via an XSL parameter, or you can dynamically modify the stylesheet itself (if you include it as a string in your Groovy program). Calling this XSLT to transform your document(s) from within Groovy is very simple.

I quickly cobbled the following Groovy script together (but I have no doubt it can be written even more simply and compactly):

import javax.xml.transform.TransformerFactory
import javax.xml.transform.stream.StreamResult
import javax.xml.transform.stream.StreamSource

def xml = """
<?xml version="1.0" encoding="utf-8"?>
<?mso-infoPathSolution name="urn:schemas-microsoft-com:office:infopath:FA_Ansoegning:http---ementor-dk-application-2007-06-22-" href="manifest.xsf" solutionVersion="1.0.0.14" productVersion="12.0.0" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?>
<application:FA_Ansoegning xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:application="http://ementor.dk/application/2007/06/22/"
xmlns:xd="http://schemas.microsoft.com/office/infopath/2003"
xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2008-04-14T14:31:48">
    <Mobiltlf></Mobiltlf>
  <E-mail-adresse></E-mail-adresse>
</application:FA_Ansoegning>
""".trim()

def xslt = """
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:param name="mobil" select="'***dummy***'"/>
    <xsl:param name="email" select="'***dummy***'"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Mobiltlf">
        <xsl:copy>
            <xsl:value-of select="\$mobil"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="E-mail-adresse">
        <xsl:copy>
            <xsl:value-of select="\$email"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
""".trim()

def factory = TransformerFactory.newInstance()
def transformer = factory.newTransformer(new StreamSource(new StringReader(xslt)))

transformer.setParameter('mobil', '1234567890')
transformer.setParameter('email', '[email protected]')

transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(System.out))

Running this script produces:

<?xml version="1.0" encoding="UTF-8"?><?mso-infoPathSolution name="urn:schemas-microsoft-com:office:infopath:FA_Ansoegning:http---ementor-dk-application-2007-06-22-" href="manifest.xsf" solutionVersion="1.0.0.14" productVersion="12.0.0" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?>
<application:FA_Ansoegning xmlns:application="http://ementor.dk/application/2007/06/22/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xd="http://schemas.microsoft.com/office/infopath/2003" xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2008-04-14T14:31:48">
    <Mobiltlf>1234567890</Mobiltlf>
  <E-mail-adresse>[email protected]</E-mail-adresse>
</application:FA_Ansoegning>
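
For copy-and-paste use, the same idea could be wrapped in a closure that returns a string rather than printing to System.out. This is only a sketch under a few assumptions: `setTagValue`, `stylesheet`, and the parameter names `tag` and `value` are invented here, the tag name is compared with `name()` inside the template because XSLT 1.0 does not allow variables in match patterns, and any child elements of a matched tag are replaced by the new text. Attributes of the matched tag are copied through.

```groovy
import javax.xml.transform.TransformerFactory
import javax.xml.transform.stream.StreamResult
import javax.xml.transform.stream.StreamSource

// Single-quoted triple string: no GString interpolation, so $tag needs no backslash.
def stylesheet = '''
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:param name="tag"/>
    <xsl:param name="value"/>

    <!-- Identity template: copies comments, processing instructions, attributes, text. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Elements: keep attributes; swap the content only when the name matches $tag. -->
    <xsl:template match="*" priority="1">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:choose>
                <xsl:when test="name() = $tag">
                    <xsl:value-of select="$value"/>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates select="node()"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
'''.trim()

// Hypothetical wrapper: returns the transformed document as a String.
def setTagValue = { String xml, String tag, String value ->
    def transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)))
    transformer.setParameter('tag', tag)
    transformer.setParameter('value', value)
    def out = new StringWriter()
    transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out))
    out.toString()
}

def transformed = setTagValue('<root><data a="1"></data></root>', 'data', 'value')
println transformed
```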