python lxml etree.iterparse: check whether the current element matches an XPath
I would like to read a fairly large XML file as a stream, but I could not find any way to use my existing XPath expressions to find elements.
Previously the files were of moderate size, so it was enough to do:
all_elements = []
for xpath in list_of_xpathes:
    all_elements.append(etree.parse(file).getroot().findall(xpath))
Now I am struggling with iterparse. Ideally, the solution would compare the path of the current element with the desired XPath:
import lxml.etree as et

xml_file = r"my.xml"  # quite big xml, that i should read
xml_paths = ['/some/arbitrary/xpath', '/another/xpath']
all_elements = []
iter = et.iterparse(xml_file, events=('end',))
for event, element in iter:
    for xpath in xml_paths:
        if element_complies_with_xpath(element, xpath):
            all_elements.append(element)
            break
How can the element_complies_with_xpath function be implemented using lxml?
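
One way to approximate the comparison described above is to track the tag path while streaming and compare it with the wanted paths as plain strings. The following is only a sketch under stated assumptions: the helper name `iter_matching` and the sample document are hypothetical, and it handles only plain absolute paths such as `/a/b/c` (no predicates, wildcards, or namespace prefixes). It is not a general XPath matcher.

```python
import io
import lxml.etree as et


def iter_matching(source, xml_paths):
    """Yield elements whose absolute tag path is listed in xml_paths.

    Hypothetical helper: only plain paths like '/a/b/c' are supported.
    """
    wanted = set(xml_paths)
    stack = []  # tags from the root down to the current element
    for event, element in et.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(element.tag)
            continue
        # 'end' event: the element and its children are fully parsed
        path = "/" + "/".join(stack)
        stack.pop()
        if path in wanted:
            yield element
            # The caller has finished with this element once the
            # generator resumes, so its content can be released.
            element.clear()


# Hypothetical sample document, used only for illustration.
sample = (
    b"<some><arbitrary><xpath>a</xpath><other>x</other>"
    b"<xpath>b</xpath></arbitrary><xpath>no</xpath></some>"
)
texts = [
    e.text
    for e in iter_matching(
        io.BytesIO(sample), ["/some/arbitrary/xpath", "/another/xpath"]
    )
]
print(texts)
```

Because matched elements are cleared after they are yielded, each one has to be processed inside the loop rather than stored for later. Elements that never match are not released in this sketch, so very large files would need additional cleanup.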
Comments (1)
If the first step of the XPath can be split off, the rest can be tested against the element as a relative path. Instead of a list of strings, a dict of
<first element name>: <rest of the xpath>
could be used. The parent element could also be used as the dict key.
Full xpath:
/some/arbitrary/xpath
dict:
{'some': './arbitrary/xpath'}
The count() XPath function could also be used.