Parsing XML files in Python with cElementTree: handling errors and line numbers in the file
I am using the cElementTree library to parse XML files in Python. Everything works fine, but I would like to give the user a complete error message when a value in the XML is not correct.
For example, suppose I have the following XML:
    <A name="xxxx" href="yyyy"/>
and I want to tell the user when the href attribute is missing or has a value that is not in a given list.
For the moment, I have something like
    if elem.get("href") not in myList:
        raise XMLException(elem, "the 'href' attribute is not valid or does not exist")
where my exception is caught somewhere else.
In addition, I would like to display the line number of the XML element in the file. It seems that cElementTree doesn't store any information about the line numbers of the elements in the tree... :-(
Question: Is there an equivalent XML library that can do that? Or is there a way to get the position of an XML element in the XML file?
Thanks
Comments (1)
The equivalent library you should use is lxml. lxml is a wrapper around the very fast C libraries libxml2 and libxslt and is generally considered superior to the built-in ones.
Luckily, it tries to keep to the ElementTree API and extends it in lxml.etree.
In lxml.etree, every element has a sourceline attribute, which is just what you are after.
So using elem.sourceline in the error message above should work.
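
To make the suggestion concrete, here is a minimal sketch of the check from the question rewritten on top of lxml.etree. It assumes lxml is installed. The XMLException class body, the ALLOWED_HREFS set, the check_hrefs helper and the SAMPLE document are hypothetical names made up for illustration; only the sourceline attribute comes from the answer itself.

```python
from lxml import etree

# Hypothetical stand-in for "myList" in the question.
ALLOWED_HREFS = {"yyyy", "zzzz"}


class XMLException(Exception):
    """Hypothetical exception that records where the offending element is."""

    def __init__(self, elem, message):
        # sourceline is the line of the element in the parsed input.
        self.elem = elem
        self.line = elem.sourceline
        super().__init__(f"line {self.line}: <{elem.tag}>: {message}")


def check_hrefs(root):
    """Raise XMLException for the first <A> whose href is missing or not allowed."""
    for elem in root.iter("A"):
        # elem.get() returns None when the attribute is absent,
        # and None is never in ALLOWED_HREFS.
        if elem.get("href") not in ALLOWED_HREFS:
            raise XMLException(
                elem, "the 'href' attribute is not valid or does not exist"
            )


# Small sample document (an assumption, not taken from the question).
SAMPLE = b"""<root>
  <A name="ok" href="yyyy"/>
  <A name="bad" href="nope"/>
</root>"""

root = etree.fromstring(SAMPLE)
try:
    check_hrefs(root)
except XMLException as exc:
    # The message starts with the line number of the offending element.
    print(exc)
```

Because lxml.etree follows the ElementTree API, the rest of the parsing code (elem.get, iteration, and so on) should need little or no change when switching the import.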