在Python中将不同的数据类型从XML加载到字典中
我使用 cElementTree
在循环中提取 xml 标签和值,然后将它们存储到字典中。
XML 文件包含:
<root>
<tag1>['item1', 'item2']</tag1>
<tag2>a normal string</tag2>
</root>
Python 代码(大致):
import xml.etree.cElementTree as xml
xmldata = {}
xmlfile = xml.parse(XMLFile.xml)
for xmltag in xmlfile.iter():
xmldata[xmltag.tag] = xmltag.text
我遇到的问题是 xml 文件包含不同的数据类型,其中包括 string
和 list
。不幸的是,Element.text
将所有 xml 值保存为 string
(包括列表)。
因此,当我从 XML 文件加载时,我有:
{'tag1':"['item1', 'item2']", 'tag2':'a normal string'}
当我希望有:
{'tag1':['item1', 'item2'], 'tag2':'a normal string'}
有没有一种简单的方法可以做到这一点?
例如,以原始格式保存到字典的命令
或者我是否需要设置 if 语句来确定值类型并使用 Element.text
的替代方案单独保存它?
I'm using cElementTree
to extract xml tags and values in a loop and then storing them into a dictionary.
XML file contains:
<root>
<tag1>['item1', 'item2']</tag1>
<tag2>a normal string</tag2>
</root>
Python code (roughly):
import xml.etree.cElementTree as xml
xmldata = {}
xmlfile = xml.parse(XMLFile.xml)
for xmltag in xmlfile.iter():
xmldata[xmltag.tag] = xmltag.text
The problem I have encountered is that the xml file contains different data types, which include string
and list
. Unfortunately Element.text
saves all the xml values as string
(including the lists).
So when I load from the XML file I have:
{'tag1':"['item1', 'item2']", 'tag2':'a normal string'}
When I'd prefer to have:
{'tag1':['item1', 'item2'], 'tag2':'a normal string'}
Is there an easy way to do this?
e.g a command that saves to the dictionary in the original format
Or do I need to set up if statements to determine the value type and save it seperately using an alternative to Element.text
?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可以使用literal_eval 尝试解析复杂的Python 文字。由于你的字符串没有被引用,它们会在 lteral eval 中引发 SyntaxError,但这很容易解决:
与 Python 的内置“eval”不同,ast.literal_eval 不允许执行表达式,因此是安全的,即使 XML数据来自不受信任的来源。
You can use literal_eval to try to parse complex python literals. Since your strigns are unquoted, they will raise a SyntaxError in lteral eval, but that is simle to work around:
Unlike Python's builtin "eval", ast.literal_eval does not allow the execution of expressions, and thus is safe, even if the XML data come from an untrusted source.
这里是一个建议的解决方案:检查
[
是否存在,然后解析列表。它不是万无一失的(如果分隔符不完全是带有空格的,
,它将无法工作),但我认为您可以很容易地改进它。打印:
{'root': '\n', 'tag1': ['item1', 'item2'], 'tag2': '普通字符串'}
Here is a proposed solution: check for the existence of
[
, then parse the list. It's not failsafe (it won't work if the separator is not exactly,
with a space) but I think that it'll be easy for you to improve it.Prints:
{'root': '\n', 'tag1': ['item1', 'item2'], 'tag2': 'a normal string'}
我认为您没有充分利用 xml 的强大功能!
为什么不这样组织你的
.xml
:这样你的 python 代码将把每个- < 的容器来处理/code>,我认为这样更好。
作为注意:您可能还想查看此处。 (我同意作者的“最喜欢的方式”)
I think that you are not using xml in all its mighty power!
Why don't you organize your
.xml
like:This way your python code will be handling every
<tag1>
as a container of<item>
, and I think that's better.Note: You may also want to take a look here. (I agree with the "Favorite Way" of the author)