Converting XML to CSV - specifics
I have some trouble with the correct formulation of a transform. I'm generating CSV files.
I can easily generate the following csv:
"version","","stuff",
"version1version2","annotation1annotation2","yadda",
However, I would like for the different instances of subfields to be comma-separated within their string, as follows:
"version","","stuff",
"version1,version2","annotation1,annotation2","yadda",
My input looks something like
<?xml version="1.0" encoding="UTF-8"?>
<collection>
<record>
<datafield tag="020">
<subfield code="a">version</subfield>
</datafield>
<datafield tag="040">
<subfield code="b">stuff</subfield>
</datafield>
</record>
<record>
<datafield tag="020">
<subfield code="a">version1</subfield>
<subfield code="9">annotation1</subfield>
</datafield>
<datafield tag="020">
<subfield code="a">version2</subfield>
<subfield code="9">annotation2</subfield>
</datafield>
<datafield tag="040">
<subfield code="b">yadda</subfield>
</datafield>
</record>
</collection>
Using the following xsl (and xsltproc)
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="collection/record"/>
</xsl:template>
<xsl:template match="record">
<xsl:text>"</xsl:text>
<xsl:apply-templates select="datafield[@tag='020']/subfield[@code='a']"/>
<xsl:text>",</xsl:text>
<xsl:text>"</xsl:text>
<xsl:apply-templates select="datafield[@tag='020']/subfield[@code='9']"/>
<xsl:text>",</xsl:text>
<xsl:text>"</xsl:text>
<xsl:apply-templates select="datafield[@tag='040']/subfield[@code='b']"/>
<xsl:text>",</xsl:text>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
I would guess that some combination of following-sibling:: or not(position()=last()) with call-template is going to be involved, but I haven't hit on a working solution yet. Any help?
I'm not looking for a generic XML-to-csv transform - anything geared to this particular dataset is fine.
Answers (2)
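The hunch about not(position()=last()) is on the right track. What follows is a minimal sketch, not the original answer's code, and it assumes the input always has the shape shown in the question (it does not escape double quotes inside values). When a template is reached through xsl:apply-templates, position() and last() refer to the node list that was selected, so a template for subfield can emit a comma after every value except the last one.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="collection/record"/>
  </xsl:template>

  <!-- One CSV line per record: three quoted columns, each followed by a comma -->
  <xsl:template match="record">
    <xsl:text>"</xsl:text>
    <xsl:apply-templates select="datafield[@tag='020']/subfield[@code='a']"/>
    <xsl:text>","</xsl:text>
    <xsl:apply-templates select="datafield[@tag='020']/subfield[@code='9']"/>
    <xsl:text>","</xsl:text>
    <xsl:apply-templates select="datafield[@tag='040']/subfield[@code='b']"/>
    <xsl:text>",&#10;</xsl:text>
  </xsl:template>

  <!-- position() and last() are relative to the node list selected by the
       calling apply-templates, so only non-final values get a trailing comma -->
  <xsl:template match="subfield">
    <xsl:value-of select="."/>
    <xsl:if test="position() != last()">
      <xsl:text>,</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Saved under a hypothetical name such as records.xsl, it can be run with xsltproc records.xsl input.xml.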
For larger input data sets, introducing a key will result in better overall performance:
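The stylesheet that accompanied this remark is not preserved here. As a sketch of what a key-based variant could look like, assume subfields are looked up per record by the tag of their datafield and their own code. The key name subfields and the '|' delimiter are illustrative choices (assuming no tag or code contains '|'), not taken from the original answer.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Index every subfield by "record id | datafield tag | subfield code" -->
  <xsl:key name="subfields" match="subfield"
           use="concat(generate-id(ancestor::record), '|', ../@tag, '|', @code)"/>

  <xsl:template match="/">
    <xsl:apply-templates select="collection/record"/>
  </xsl:template>

  <!-- generate-id() without arguments identifies the current record -->
  <xsl:template match="record">
    <xsl:text>"</xsl:text>
    <xsl:apply-templates select="key('subfields', concat(generate-id(), '|020|a'))"/>
    <xsl:text>","</xsl:text>
    <xsl:apply-templates select="key('subfields', concat(generate-id(), '|020|9'))"/>
    <xsl:text>","</xsl:text>
    <xsl:apply-templates select="key('subfields', concat(generate-id(), '|040|b'))"/>
    <xsl:text>",&#10;</xsl:text>
  </xsl:template>

  <!-- Comma after every value except the last one in the selected node list -->
  <xsl:template match="subfield">
    <xsl:value-of select="."/>
    <xsl:if test="position() != last()">
      <xsl:text>,</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The key is built once per document, so each column lookup avoids re-walking the datafield children of the record. Whether that pays off depends on how many datafields a record holds.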
More generally, this stylesheet:
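The original listing is missing at this point. One way to generalize, offered as a sketch under assumptions and not as the answerer's code, is to move the quoting and joining into a single named template, so that each column is just a node-set handed to it. The template name column and the parameters values and sep are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="collection/record"/>
  </xsl:template>

  <!-- Each column is described only by the nodes it should contain -->
  <xsl:template match="record">
    <xsl:call-template name="column">
      <xsl:with-param name="values" select="datafield[@tag='020']/subfield[@code='a']"/>
    </xsl:call-template>
    <xsl:call-template name="column">
      <xsl:with-param name="values" select="datafield[@tag='020']/subfield[@code='9']"/>
    </xsl:call-template>
    <xsl:call-template name="column">
      <xsl:with-param name="values" select="datafield[@tag='040']/subfield[@code='b']"/>
    </xsl:call-template>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Writes one quoted column followed by a comma; values inside the
       column are joined with $sep. Embedded quotes are not escaped. -->
  <xsl:template name="column">
    <xsl:param name="values"/>
    <xsl:param name="sep" select="','"/>
    <xsl:text>"</xsl:text>
    <xsl:for-each select="$values">
      <xsl:if test="position() != 1">
        <xsl:value-of select="$sep"/>
      </xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>
    <xsl:text>",</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Adding a column then means adding one more call-template with a different select expression, and the inner separator can be changed per column through sep.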
Result:
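For the sample input in the question, the output being aimed for is the following (taken from the question itself):

```text
"version","","stuff",
"version1,version2","annotation1,annotation2","yadda",
```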