Which Python XML library should I use?

Posted on 2024-09-03 10:18:13

I am going to handle XML files for a project. I had earlier decided to use lxml, but after reading the requirements, I think ElementTree would be better for my purpose.

The XML files that have to be processed are:

  1. Small in size. Typically < 10 KB.

  2. No namespaces.

  3. Simple XML structure.

Given the small XML size, memory is not an issue. My only concern is fast parsing.

What should I go with? Mostly I have seen people recommend lxml, but given my parsing requirements, do I really stand to benefit from it or would ElementTree serve my purpose better?

Comments (3)

仙气飘飘 2024-09-10 10:18:13

As others have pointed out, lxml implements the ElementTree API, so you're safe starting out with ElementTree and migrating to lxml if you need better performance or more advanced features.

The big advantage of using ElementTree, if it meets your needs, is that as of Python 2.5 it is part of the Python standard library, which cuts down on external dependencies and the (possible) headache of dealing with compiling/installing C modules.
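A minimal sketch of what that migration path can look like, assuming the code only uses calls shared by both libraries (`fromstring`, `findall`, `get`, `.text`). The `parse_items` helper and the sample document are hypothetical, made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: small, no namespaces, simple structure.
SAMPLE = "<config><item name='a'>1</item><item name='b'>2</item></config>"


def parse_items(xml_text, etree=ET):
    """Return {name: text} for every <item> child of the root element.

    `etree` is any module exposing the ElementTree API, so the same
    function works with xml.etree.ElementTree or lxml.etree.
    """
    root = etree.fromstring(xml_text)
    return {item.get("name"): item.text for item in root.findall("item")}


# Standard library only: no external dependency needed.
stdlib_result = parse_items(SAMPLE)
print(stdlib_result)

# Optional: if lxml happens to be installed, the same function runs unchanged.
try:
    from lxml import etree as lxml_etree
except ImportError:
    lxml_etree = None

if lxml_etree is not None:
    assert parse_items(SAMPLE, lxml_etree) == stdlib_result
```

Passing the module in as a parameter keeps the choice of library in one place, so switching later means changing a single import rather than the parsing code.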

む无字情书 2024-09-10 10:18:13

lxml is basically a superset of ElementTree, so you could start with ElementTree and switch to lxml if you run into performance or functionality issues.

Performance can only be assessed by measuring it yourself with your own data.
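One way to take that measurement is with the standard `timeit` module. The sketch below is only an illustration: `make_sample` and `time_parser` are hypothetical helpers, and the generated document is a stand-in for your real files (small, no namespaces, simple structure). No timings are shown here because they depend entirely on your machine and data.

```python
import timeit
import xml.etree.ElementTree as ET


def make_sample(n_records=100):
    """Build a small, namespace-free XML document (well under 10 KB)."""
    rows = "".join(
        f"<record id='{i}'><name>item{i}</name><value>{i}</value></record>"
        for i in range(n_records)
    )
    return f"<records>{rows}</records>"


def time_parser(fromstring, xml_text, number=200):
    """Return the best observed time per parse, in seconds."""
    runs = timeit.repeat(lambda: fromstring(xml_text), number=number, repeat=3)
    return min(runs) / number


sample = make_sample()
results = {"ElementTree": time_parser(ET.fromstring, sample)}

# lxml is optional; it is only timed if it is installed.
try:
    from lxml import etree as lxml_etree
    results["lxml"] = time_parser(lxml_etree.fromstring, sample)
except ImportError:
    pass

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per parse")
```

Replace `make_sample()` with the contents of a representative file from your project before drawing any conclusions.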

人疚 2024-09-10 10:18:13

I recommend my own recipe

XML to Python data structure « Python recipes « ActiveState Code

It does not speed up parsing, but it provides native, object-style access to the parsed data.

>>> SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
... <address_book>
...   <person gender='m'>
...     <name>fred</name>
...     <phone type='home'>54321</phone>
...     <phone type='cell'>12345</phone>
...     <note>"A<!-- comment --><![CDATA[ <note>]]>"</note>
...   </person>
... </address_book>
... """
>>> address_book = xml2obj(SAMPLE_XML)
>>> person = address_book.person


person.gender        -> 'm'     # an attribute
person['gender']     -> 'm'     # alternative dictionary syntax
person.name          -> 'fred'  # shortcut to a text node
person.phone[0].type -> 'home'  # multiple elements become a list
person.phone[0].data -> '54321' # use .data to get the text value
str(person.phone[0]) -> '54321' # alternative syntax for the text value
person[0]            -> person  # if there is only one <person>, it can still
                                # be used as if it were a list of one element
'address' in person  -> False   # test for the existence of an attribute or child
person.address       -> None    # a nonexistent element returns None
bool(person.address) -> False   # has any 'address' data (attr, child or text)
person.note          -> '"A <note>"'
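For comparison, here is a rough sketch of reading the same kind of data with plain ElementTree from the standard library. This is not the recipe's `xml2obj`; it uses a trimmed copy of the sample (the `<note>` element is left out) and only standard ElementTree calls:

```python
import xml.etree.ElementTree as ET

# Trimmed version of the address book sample, passed as bytes so the
# encoding declaration is handled by the parser.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<address_book>
  <person gender='m'>
    <name>fred</name>
    <phone type='home'>54321</phone>
    <phone type='cell'>12345</phone>
  </person>
</address_book>
"""

root = ET.fromstring(SAMPLE_XML)
person = root.find("person")

gender = person.get("gender")        # attribute lookup
name = person.findtext("name")       # text of the first <name> child
phones = {p.get("type"): p.text for p in person.findall("phone")}
has_address = person.find("address") is not None  # find() returns None if absent

print(gender, name, phones, has_address)
```

The access is more explicit than attribute-style lookups, but it needs nothing beyond the standard library.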