Creating subtrees from a tree represented in XML - Python

Posted on 2024-08-24 12:27:59


I have an XML document (in the form of a tree), and I need to create subtrees from it.

For example:

<a>
  <b>
    <c>Hello</c>
  <d>
    <e>Hi</e>
</a>

The subtrees would be:

<root>
<a>
  <b>
    <c>Hello</c>
  </b>
</a>
<a>
  <d>
    <e>Hi</e>
  </d>
</a>
</root>

Which Python XML library is best suited to this? An existing algorithm that already does this would also be helpful. Note: the XML document won't be large; it will easily fit in memory.


Comments (1)

昵称有卵用 2024-08-31 12:27:59


ElementTree is good and simple for both reading and writing XML.

Your first XML example (I edited your question just to add formatting so it would be readable!) is invalid. I assume it is missing the close tags for b and d, since they appear in what you call "the subtree" (which looks nothing like a subtree to me, but does look like it's intended as a rewrite of your first form).
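To see why the first example counts as invalid, here is a minimal sketch that feeds both forms to the parser. The helper name is_well_formed and the constants BROKEN and FIXED are made up for this illustration; the only library calls used are ET.fromstring and ET.ParseError.

```python
import xml.etree.ElementTree as ET

# The question's first example, with the close tags for <b> and <d> missing.
BROKEN = """<a>
  <b>
    <c>Hello</c>
  <d>
    <e>Hi</e>
</a>
"""

# The same document with the assumed close tags restored.
FIXED = """<a>
  <b>
    <c>Hello</c>
  </b>
  <d>
    <e>Hi</e>
  </d>
</a>
"""


def is_well_formed(xml_text):
    """Return True if xml_text parses, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(BROKEN))  # False
print(is_well_formed(FIXED))   # True
```

The parser rejects the first form as soon as it meets </a> while <d> is still open, which is why the rest of this answer works from the repaired document.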

Setting aside "prettification" issues (e.g. adding newlines and indents to make the resulting XML look pretty;-), this code should do what you're asking, if I understand you correctly:

import io
import xml.etree.ElementTree as et

xmlin = io.StringIO('''<a>
  <b>
    <c>Hello</c>
  </b>
  <d>
    <e>Hi</e>
  </d>
</a>
''')

tin = et.parse(xmlin)
top = tin.getroot()
tou = et.ElementTree(et.Element('root'))
newtop = tou.getroot()
# Iterating an Element yields its children (getchildren() was removed in Python 3.9).
for child in top:
  subtree = et.Element(top.tag)  # a fresh wrapper carrying the input root's tag
  subtree.append(child)
  newtop.append(subtree)

# encoding='unicode' makes write() emit str, which is what sys.stdout expects.
import sys
tou.write(sys.stdout, encoding='unicode')

The imports need no try/except fallback: on Python 3, xml.etree.ElementTree uses its C accelerator automatically when one is available and falls back to the pure-Python implementation otherwise, and the old cElementTree and cStringIO modules no longer exist (io.StringIO takes the place of the latter).

Then I build two trees: tin, the input one, parsed from the XML string you're given; and tou, the output one, initially empty except for the root element.

All the rest is a very simple loop over the subelements of tin's root: for each one, a suitable subtree is built and appended to the subelements of tou's root. That's all there is to it.

The last two lines show the resulting tree (not pretty, due to whitespace issues, but perfectly correct in terms of XML structure;-).
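Two things the loop above leaves open are the untidy whitespace and the fact that the appended children are shared between the input and output trees rather than copied. The following is a minimal sketch of one way to handle both, assuming Python 3; the function name split_children and its new_root_tag parameter are invented for this illustration, and ET.indent is only available from Python 3.9, so the call is guarded.

```python
import copy
import xml.etree.ElementTree as ET


def split_children(xml_text, new_root_tag="root"):
    """Wrap each child of the input root in its own copy of the root tag."""
    top = ET.fromstring(xml_text)
    new_root = ET.Element(new_root_tag)
    for child in top:
        # New wrapper with the input root's tag and a copy of its attributes.
        wrapper = ET.Element(top.tag, dict(top.attrib))
        # Deep-copy so the input tree and the output tree share no elements.
        wrapper.append(copy.deepcopy(child))
        new_root.append(wrapper)
    return new_root


SOURCE = "<a><b><c>Hello</c></b><d><e>Hi</e></d></a>"

result = split_children(SOURCE)
if hasattr(ET, "indent"):  # ET.indent was added in Python 3.9
    ET.indent(result, space="  ")
pretty = ET.tostring(result, encoding="unicode")
print(pretty)
```

Copying costs a little memory, but since the document easily fits in memory that seems a fair trade for being able to modify either tree afterwards without affecting the other.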
