在Python中将不同的数据类型从XML加载到字典中

发布于 2024-12-26 04:52:44 字数 905 浏览 4 评论 0原文

我使用 cElementTree 在循环中提取 xml 标签和值,然后将它们存储到字典中。

XML 文件包含:

<root>
    <tag1>['item1', 'item2']</tag1>
    <tag2>a normal string</tag2>
</root>

Python 代码(大致):

import xml.etree.cElementTree as xml

xmldata = {}
xmlfile = xml.parse(XMLFile.xml)
for xmltag in xmlfile.iter():
    xmldata[xmltag.tag] = xmltag.text

我遇到的问题是 xml 文件包含不同的数据类型,其中包括 stringlist。不幸的是,Element.text 将所有 xml 值保存为 string(包括列表)。

因此,当我从 XML 文件加载时,我有:

{'tag1':"['item1', 'item2']", 'tag2':'a normal string'}

当我希望有:

{'tag1':['item1', 'item2'], 'tag2':'a normal string'}

有没有一种简单的方法可以做到这一点?
例如,以原始格式保存到字典的命令

或者我是否需要设置 if 语句来确定值类型并使用 Element.text 的替代方案单独保存它?

I'm using cElementTree to extract xml tags and values in a loop and then storing them into a dictionary.

XML file contains:

<root>
    <tag1>['item1', 'item2']</tag1>
    <tag2>a normal string</tag2>
</root>

Python code (roughly):

import xml.etree.cElementTree as xml

xmldata = {}
xmlfile = xml.parse(XMLFile.xml)
for xmltag in xmlfile.iter():
    xmldata[xmltag.tag] = xmltag.text

The problem I have encountered is that the xml file contains different data types, which include string and list. Unfortunately Element.text saves all the xml values as string (including the lists).

So when I load from the XML file I have:

{'tag1':"['item1', 'item2']", 'tag2':'a normal string'}

When I'd prefer to have:

{'tag1':['item1', 'item2'], 'tag2':'a normal string'}

Is there an easy way to do this?
e.g a command that saves to the dictionary in the original format

Or do I need to set up if statements to determine the value type and save it seperately using an alternative to Element.text?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

不乱于心 2025-01-02 04:52:44

您可以使用literal_eval 尝试解析复杂的Python 文字。由于你的字符串没有被引用,它们会在 lteral eval 中引发 SyntaxError,但这很容易解决:

import xml.etree.cElementTree as xml
from ast import literal_eval

xmldata = {}
xmlfile = xml.parse(XMLFile.xml)
for xmltag in xmlfile.iter():
    try:
        xmldata[xmltag.tag] = literal_eval(xmltag.text)
    except SyntaxError:
        xmldata[xmltag.tag] = xmltag.text

与 Python 的内置“eval”不同,ast.literal_eval 不允许执行表达式,因此是安全的,即使 XML数据来自不受信任的来源。

You can use literal_eval to try to parse complex python literals. Since your strigns are unquoted, they will raise a SyntaxError in lteral eval, but that is simle to work around:

import xml.etree.cElementTree as xml
from ast import literal_eval

xmldata = {}
xmlfile = xml.parse(XMLFile.xml)
for xmltag in xmlfile.iter():
    try:
        xmldata[xmltag.tag] = literal_eval(xmltag.text)
    except SyntaxError:
        xmldata[xmltag.tag] = xmltag.text

Unlike Python's builtin "eval", ast.literal_eval does not allow the execution of expressions, and thus is safe, even if the XML data come from an untrusted source.

開玄 2025-01-02 04:52:44

这里是一个建议的解决方案:检查 [ 是否存在,然后解析列表。它不是万无一失的(如果分隔符不完全是带有空格的 , ,它将无法工作),但我认为您可以很容易地改进它。

import xml.etree.cElementTree as xml

xmldata = {}
xmlfile = xml.parse("data.xml")
for xmltag in xmlfile.iter():
    # it's a list
    if "[" in xmltag.text:
        d = xmltag.text.lstrip("[").rstrip("]")
        l = [item.lstrip("'").rstrip("'") for item in d.split(", ")]
        xmldata[xmltag.tag] = l
    else:
        xmldata[xmltag.tag] = xmltag.text

print xmldata

打印:{'root': '\n', 'tag1': ['item1', 'item2'], 'tag2': '普通字符串'}

Here is a proposed solution: check for the existence of [, then parse the list. It's not failsafe (it won't work if the separator is not exactly , with a space) but I think that it'll be easy for you to improve it.

import xml.etree.cElementTree as xml

xmldata = {}
xmlfile = xml.parse("data.xml")
for xmltag in xmlfile.iter():
    # it's a list
    if "[" in xmltag.text:
        d = xmltag.text.lstrip("[").rstrip("]")
        l = [item.lstrip("'").rstrip("'") for item in d.split(", ")]
        xmldata[xmltag.tag] = l
    else:
        xmldata[xmltag.tag] = xmltag.text

print xmldata

Prints: {'root': '\n', 'tag1': ['item1', 'item2'], 'tag2': 'a normal string'}

笨死的猪 2025-01-02 04:52:44

我认为您没有充分利用 xml 的强大功能!

为什么不这样组织你的 .xml

<root>
    <tag1>
        <item>item1</item>
        <item>item2</item>
    </tag1>
    <tag2>a normal string<tag2>
</root>

这样你的 python 代码将把每个 作为 < 的容器来处理/code>,我认为这样更好。

注意:您可能还想查看此处。 (我同意作者的“最喜欢的方式”)

I think that you are not using xml in all its mighty power!

Why don't you organize your .xml like:

<root>
    <tag1>
        <item>item1</item>
        <item>item2</item>
    </tag1>
    <tag2>a normal string<tag2>
</root>

This way your python code will be handling every <tag1> as a container of <item>, and I think that's better.

Note: You may also want to take a look here. (I agree with the "Favorite Way" of the author)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文