Loading different data types from XML into a dictionary in Python

Posted 2024-12-26 04:52:44 | 905 words | 4 views | 0 comments

I'm using cElementTree to extract xml tags and values in a loop and then storing them into a dictionary.

XML file contains:

<root>
    <tag1>['item1', 'item2']</tag1>
    <tag2>a normal string</tag2>
</root>

Python code (roughly):

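# cElementTree was removed in Python 3.9; ElementTree is the drop-in replacement
import xml.etree.ElementTree as xml

xmldata = {}
xmlfile = xml.parse("XMLFile.xml")  # the file name must be a string
for xmltag in xmlfile.iter():
    xmldata[xmltag.tag] = xmltag.text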

The problem I have encountered is that the XML file contains different data types, including strings and lists. Unfortunately, Element.text returns every XML value as a string (including the lists).

So when I load from the XML file I have:

{'tag1':"['item1', 'item2']", 'tag2':'a normal string'}

When I'd prefer to have:

{'tag1':['item1', 'item2'], 'tag2':'a normal string'}

Is there an easy way to do this?
E.g. a command that saves to the dictionary in the original format?

Or do I need to set up if statements to determine the value type and save it separately using an alternative to Element.text?

不乱于心 2025-01-02 04:52:44

You can use literal_eval to try to parse complex Python literals. Since your strings are unquoted, they will raise a SyntaxError in literal_eval (a single bare word, or an element with no text, raises a ValueError instead), but that is simple to work around:

import xml.etree.ElementTree as xml  # cElementTree was removed in Python 3.9
from ast import literal_eval

xmldata = {}
xmlfile = xml.parse("XMLFile.xml")
for xmltag in xmlfile.iter():
    try:
        xmldata[xmltag.tag] = literal_eval(xmltag.text)
    except (SyntaxError, ValueError):
        # not a Python literal: keep the raw text
        xmldata[xmltag.tag] = xmltag.text

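Unlike Python's built-in eval, ast.literal_eval only accepts literal structures (strings, numbers, tuples, lists, dicts, sets, booleans and None) and does not evaluate arbitrary expressions, so it will not run code embedded in the XML, even if the data come from an untrusted source. Very large or deeply nested input can still exhaust memory or hit the recursion limit.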
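For illustration, here is a minimal self-contained sketch of the same idea. The helper name parse_value and the inline XML_TEXT string are assumptions made for this example (they are not part of the answer above), and the XML is embedded so that no file is needed.

```python
import xml.etree.ElementTree as ET
from ast import literal_eval

# Inline copy of the XML from the question, so the sketch runs on its own.
XML_TEXT = """<root>
    <tag1>['item1', 'item2']</tag1>
    <tag2>a normal string</tag2>
</root>"""


def parse_value(text):
    """Return the Python literal encoded in text, or text itself if it is not one.

    parse_value is a hypothetical helper name used only in this sketch.
    """
    if text is None:  # empty elements such as <tag/> have no text
        return None
    try:
        return literal_eval(text.strip())
    except (ValueError, TypeError, SyntaxError):
        # Not a Python literal (e.g. an unquoted sentence): keep the raw string.
        return text


root = ET.fromstring(XML_TEXT)
# Iterate over the children only, so the whitespace-only text of <root> is skipped.
xmldata = {child.tag: parse_value(child.text) for child in root}
print(xmldata)  # {'tag1': ['item1', 'item2'], 'tag2': 'a normal string'}
```

One consequence of this approach is that any text that happens to be a valid literal is converted: an element containing 42 would come back as an int rather than the string '42'. If that is not wanted, the conversion could be restricted to text that starts with a bracket.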

開玄 2025-01-02 04:52:44

Here is a proposed solution: check whether the text contains a [ character, then parse the list by hand. It's not foolproof (it won't work if the separator is not exactly a comma followed by a space), but I think it will be easy for you to improve.

import xml.etree.ElementTree as xml  # cElementTree was removed in Python 3.9

xmldata = {}
xmlfile = xml.parse("data.xml")
for xmltag in xmlfile.iter():
    text = xmltag.text or ""  # text is None for empty elements
    # it's a list
    if "[" in text:
        d = text.lstrip("[").rstrip("]")
        items = [item.lstrip("'").rstrip("'") for item in d.split(", ")]
        xmldata[xmltag.tag] = items
    else:
        xmldata[xmltag.tag] = xmltag.text

print(xmldata)

Prints: {'root': '\n', 'tag1': ['item1', 'item2'], 'tag2': 'a normal string'}
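One possible way to make the manual parsing less sensitive to spacing is sketched below. The function name parse_bracketed is hypothetical, and the sketch assumes that no list item itself contains a comma; it is an illustration of the improvement hinted at above, not a general parser.

```python
def parse_bracketed(text):
    """Turn "['a','b' , 'c']" into ['a', 'b', 'c']; return other text unchanged.

    parse_bracketed is a hypothetical helper. It assumes items contain no commas.
    """
    stripped = (text or "").strip()
    # Only treat the value as a list when it is fully wrapped in brackets.
    if not (stripped.startswith("[") and stripped.endswith("]")):
        return text
    inner = stripped[1:-1]
    if not inner.strip():  # "[]" or "[   ]" is an empty list
        return []
    # Split on bare commas, then trim whitespace and surrounding quotes per item.
    return [item.strip().strip("'\"") for item in inner.split(",")]


print(parse_bracketed("['item1','item2' ,  'item3']"))
print(parse_bracketed("a normal string"))
```

Checking both the first and the last character avoids treating ordinary text that merely contains a [ somewhere as a list.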

笨死的猪 2025-01-02 04:52:44

I think you are not using XML to its full power!

Why not organize your XML file like this:

<root>
    <tag1>
        <item>item1</item>
        <item>item2</item>
    </tag1>
    <tag2>a normal string</tag2>
</root>

This way your Python code will handle every <tag1> as a container of <item> elements, and I think that's better.

Note: You may also want to take a look here. (I agree with the "Favorite Way" of the author)
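As a rough sketch of how such a file could be read into the dictionary from the question: the inline XML_TEXT string is an assumption so the example runs without a file, and the rule "an element with <item> children is a list" is one possible convention, not something the answer prescribes.

```python
import xml.etree.ElementTree as ET

# Inline copy of the restructured XML, so the sketch runs on its own.
XML_TEXT = """<root>
    <tag1>
        <item>item1</item>
        <item>item2</item>
    </tag1>
    <tag2>a normal string</tag2>
</root>"""

root = ET.fromstring(XML_TEXT)
xmldata = {}
for child in root:
    items = child.findall("item")  # direct <item> children only
    if items:
        # Elements that contain <item> children become lists.
        xmldata[child.tag] = [item.text for item in items]
    else:
        # Everything else keeps its plain text.
        xmldata[child.tag] = child.text

print(xmldata)  # {'tag1': ['item1', 'item2'], 'tag2': 'a normal string'}
```

With this convention an empty list cannot be told apart from an element with no text; an attribute such as type="list" (again hypothetical) would be one way to mark it explicitly.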

About the author

仙气飘飘
