Merging multiple XML files from the command line
I have several XML files. They all have the same structure, but were split due to file size. So, let's say I have A.xml, B.xml, C.xml and D.xml and want to combine/merge them into combined.xml using a command line tool.
A.xml
<products>
    <product id="1234"></product>
    ...
</products>
B.xml
<products>
    <product id="5678"></product>
    ...
</products>
etc.
High-tech answer:
Save this Python script as xmlcombine.py and run it to combine the files.
For further enhancement, consider using:
chmod +x xmlcombine.py: Allows you to omit python in the command line.
xmlcombine.py !(combined).xml > combined.xml: Collects all XML files except the output, but requires bash's extglob option.
xmlcombine.py *.xml | sponge combined.xml: Collects everything in combined.xml as well, but requires the sponge program.
import lxml.etree as ElementTree: Uses a potentially faster XML parser.
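The script body is not reproduced above, so here is a minimal sketch of what such an xmlcombine.py could look like. It assumes every input file has the same root element (products in the question) and simply appends the children of each later root to the first one. The function name combine is an illustrative choice, not taken from the answer.

```python
#!/usr/bin/env python3
"""Merge XML files that share the same root element (illustrative sketch)."""
import sys
from xml.etree import ElementTree


def combine(filenames):
    """Return the first file's root with the children of all later roots appended."""
    combined = None
    for filename in filenames:
        root = ElementTree.parse(filename).getroot()
        if combined is None:
            combined = root          # keep the first root as the container
        else:
            combined.extend(root)    # append this root's child elements
    return combined


if __name__ == "__main__":
    merged = combine(sys.argv[1:])
    if merged is not None:
        # encoding="unicode" makes tostring() return str instead of bytes
        sys.stdout.write(ElementTree.tostring(merged, encoding="unicode"))
        sys.stdout.write("\n")
```

With this sketch, something like python xmlcombine.py A.xml B.xml C.xml D.xml > combined.xml would write the merged document. Take care that combined.xml is not itself among the inputs, which is what the extglob and sponge variants above address.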
xml_grep
http://search.cpan.org/dist/XML-Twig/tools/xml_grep/xml_grep
Use it to wrap the result in an enclosing tag (here: products) and to select the XML subtree to grep (here: product).
Low-tech simple answer:
Limitations:
Any current contents of combined.xml will be wiped out instead of getting included.
Each of these limitations can be worked around, but not all of them easily.
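A line-based merge in this spirit can be sketched in Python. This is an assumption about the general approach, not the answer's own commands: it presumes the outer tag is products, carries no attributes, and sits alone on its own line in every file. The name merge_by_lines is hypothetical.

```python
import re
import sys


def merge_by_lines(paths, outer="products"):
    """Concatenate files line by line, keeping a single pair of outer tags."""
    # Matches a line holding only the bare outer tag or an XML declaration.
    skip = re.compile(r"^\s*(</?%s>|<\?xml[^>]*\?>)\s*$" % re.escape(outer))
    yield "<%s>\n" % outer
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if not skip.match(line):
                    # Normalise a missing final newline so lines never fuse.
                    yield line if line.endswith("\n") else line + "\n"
    yield "</%s>\n" % outer


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.stdout.writelines(merge_by_lines(sys.argv[1:]))
```

Because it never parses the XML, this sketch breaks as soon as the outer tag shares a line with other content or carries attributes, so it is only suitable for files as regular as the ones in the question.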
Merging two trees includes the task of identifying which elements are identical and which should be replaced. Unfortunately, this is not obvious: more semantics are involved than can be inferred from the source XML documents.
Consider the case where the first document has a middle level with several elements that share the same tag but have different attributes. The second document adds an attribute to an existing element at that middle level, and also adds another child to it. One has to know the semantics.
add/merge:
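To make the point concrete, here is a sketch of one possible merge rule. The rule itself is an assumption that the documents cannot supply: two elements count as "the same" when they share a tag and the value of an id attribute. Matching elements get their attributes updated and their children merged recursively, and everything else is added. The name merge_into and the key parameter are illustrative.

```python
from xml.etree import ElementTree


def merge_into(target, source, key="id"):
    """Merge source's children into target, matching on (tag, key attribute)."""
    # Index the existing children that carry the key attribute.
    index = {(c.tag, c.get(key)): c for c in target if c.get(key) is not None}
    for child in source:
        match = index.get((child.tag, child.get(key)))
        if match is None:
            target.append(child)                # unknown element: add it
        else:
            match.attrib.update(child.attrib)   # known element: source attributes win
            merge_into(match, child, key)       # then merge its children the same way
    return target
```

Choosing a different key, or letting the first document win on conflicting attributes, gives a different but equally defensible result, which is exactly why the semantics have to be decided by whoever knows the data.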
Another very helpful tool is yq, which aims to be jq for YAML, TOML and XML. It can be installed via pip; the XML handling command is then called xq.