Python: libxml2 xpath returns an empty list
I want to parse XML content with Python's libxml2 using XPath. I followed this example and that tutorial (http://www.zvon.org/xxl/XPathTutorial/General/examples.html). The XML file is:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://purl.org/atom/ns#" version="0.3">
<title>Gmail - Inbox for [email protected]</title>
<tagline>New messages in your Gmail Inbox</tagline>
<fullcount>1</fullcount>
<link rel="alternate" href="http://mail.google.com/mail" type="text/html"/>
<modified>2011-05-04T18:56:19Z</modified>
</feed>
This XML is stored in a file called "atom", and I tried the following:
>>> import libxml2
>>> myfile = open('/pathtomyfile/atom', 'r').read()
>>> xmldata = libxml2.parseDoc(myfile)
>>> xmldata.xpathEval('/fullcount')
[]
>>>
As you can see, it returns an empty list. No matter what expression I pass to xpathEval, it returns an empty list. However, if I use the * wildcard, I get a list of all nodes:
>>> xmldata.xpathEval('//*')
[<xmlNode (feed) object at 0xb73862cc>, <xmlNode (title) object at 0xb738650c>, <xmlNode (tagline) object at 0xb73865ec>, <xmlNode (fullcount) object at 0xb738660c>, <xmlNode (link) object at 0xb738662c>, <xmlNode (modified) object at 0xb738664c>]
Judging from the working examples above, I don't understand why XPath doesn't find the "fullcount" node or any other; I'm using the same syntax, after all.
Any ideas or suggestions? Thanks.
Comments (2)
Your XPath is failing because you need to specify the purl namespace on the node:
Result:
(Also: check out lxml, it has a nicer, higher-level interface).
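To illustrate the namespace point, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in parser (it is not the libxml2 API). The prefix name "purl" and the trimmed-down copy of the feed are assumptions made for the example; any prefix works as long as it maps to the namespace URI declared on the feed element. With the libxml2 bindings, the equivalent step is to create a context with xpathNewContext() and register the prefix with xpathRegisterNs() before calling xpathEval() on that context.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the feed from the question (illustrative data).
ATOM_XML = """<feed xmlns="http://purl.org/atom/ns#" version="0.3">
<fullcount>1</fullcount>
<modified>2011-05-04T18:56:19Z</modified>
</feed>"""

# "purl" is an arbitrary prefix chosen for this sketch; only the URI matters.
NS = {"purl": "http://purl.org/atom/ns#"}

root = ET.fromstring(ATOM_XML)

# An unprefixed name only matches elements in no namespace, so nothing is found.
unqualified = root.findall(".//fullcount")

# Qualifying the name with a prefix bound to the feed's namespace finds the element.
qualified = root.findall(".//purl:fullcount", NS)

print(unqualified)                    # []
print([el.text for el in qualified])  # ['1']
```

The wildcard query in the question works for the same reason the unprefixed query fails: * matches elements in any namespace, while a bare name such as fullcount matches only elements that have no namespace.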
Firstly, /fullcount is an absolute path, so it looks for the <fullcount> element at the root of the document, when the element is in fact inside the <feed> element.
Secondly, you need to specify the namespace. This is how you would do it with lxml:
Which would give you:
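As a minimal sketch of both points with lxml, assuming the lxml package is installed: the prefix "atom" and the shortened feed below are choices made for this example, not taken from the answer. The xpath() method accepts a namespaces mapping from prefix to URI.

```python
from lxml import etree

# Shortened copy of the feed from the question (illustrative data).
# Bytes are used because the document carries an encoding declaration.
ATOM_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://purl.org/atom/ns#" version="0.3">
<title>Gmail - Inbox</title>
<fullcount>1</fullcount>
</feed>"""

# "atom" is an arbitrary prefix chosen for this sketch.
NS = {"atom": "http://purl.org/atom/ns#"}

root = etree.fromstring(ATOM_XML)

# Point 1: /fullcount asks for a root element named fullcount; the root is feed.
no_match_absolute = root.xpath("/fullcount")

# Point 2: even a descendant search fails while the name is unprefixed.
no_match_unprefixed = root.xpath("//fullcount")

# With the full path and a bound prefix, the element is found.
matches = root.xpath("/atom:feed/atom:fullcount", namespaces=NS)
fullcount_text = matches[0].text

print(no_match_absolute, no_match_unprefixed, fullcount_text)
```

XPath 1.0 has no notion of a default namespace for names in an expression, so every element name in the path needs a prefix, including feed.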