How do I find XML tags recursively using lxml?

Posted 2024-08-30 20:37:00

<?xml version="1.0" ?>
<data>
    <test >
        <f1 />
    </test >
    <test2 >
        <test3>
         <f1 />
        </test3>
    </test2>
    <f1 />
</data>

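Using lxml, is it possible to search recursively for the tag "f1"? I tried the findall method, but it only matches immediate children.

I think I should go for BeautifulSoup for this!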



仅冇旳回忆 2024-09-06 20:37:00

You can use XPath to search recursively:

>>> from lxml import etree
>>> q = etree.fromstring('<xml><hello>a</hello><x><hello>b</hello></x></xml>')
>>> q.findall('hello')     # Tag name, first level only.
[<Element hello at 414a7c8>]
>>> q.findall('.//hello')  # XPath, recursive.
[<Element hello at 414a7c8>, <Element hello at 414a818>]
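
Applied to the XML from the question, the same idea might look like the sketch below. The `DOC` constant and the variable names are only illustrative, and the fallback import is an assumption added so the snippet also runs without lxml installed (the standard library's ElementTree supports the same `findall` and `iter` calls used here).

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib, which offers the same calls used below.
    import xml.etree.ElementTree as etree

# The document from the question, passed as bytes.
DOC = b"""<?xml version="1.0" ?>
<data>
    <test>
        <f1 />
    </test>
    <test2>
        <test3>
            <f1 />
        </test3>
    </test2>
    <f1 />
</data>"""

root = etree.fromstring(DOC)

direct = root.findall("f1")       # bare tag name: direct children of <data> only
nested = root.findall(".//f1")    # ".//" prefix: descendants at any depth
via_iter = list(root.iter("f1"))  # tree iteration filtered by tag name

print(len(direct), len(nested), len(via_iter))
```

Both `findall(".//f1")` and `iter("f1")` walk the whole subtree in document order, so either should work here; `iter` avoids building a path expression when only a tag name is needed.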


爺獨霸怡葒院 2024-09-06 20:37:00

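iterfind() iterates over all Elements that match the path expression

findall() returns a list of matching Elements

find() efficiently returns only the first match

findtext() returns the .text content of the first match

Illustrative examples: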

>>> from lxml import etree
>>> root = etree.XML("<root><a x='123'>aText<b/><c/><b/></a></root>")
>>> # Find a child of an Element:
>>> print(root.find("b"))
None
>>> print(root.find("a").tag)
a
>>> # Find an Element anywhere in the tree:
>>> print(root.find(".//b").tag)
b
>>> [ b.tag for b in root.iterfind(".//b") ]
['b', 'b']
>>> # Find Elements with a certain attribute:
>>> print(root.findall(".//a[@x]")[0].tag)
a
>>> print(root.findall(".//a[@y]"))
[]
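
The examples above do not exercise findtext(), so here is a small sketch of it on the same sample document. The variable names are illustrative, and the fallback import is an assumption so the snippet can run without lxml (the standard library's ElementTree provides the same methods).

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib, which offers the same methods.
    import xml.etree.ElementTree as etree

root = etree.fromstring("<root><a x='123'>aText<b/><c/><b/></a></root>")

# Text of the first matching child.
text_a = root.findtext("a")

# A match that has no text yields an empty string, not None.
text_b = root.findtext(".//b")

# No match at all: the default is returned instead.
missing = root.findtext(".//z", default="n/a")

# iterfind() is lazy; collect the tags of every element carrying an "x" attribute.
with_attr = [el.tag for el in root.iterfind(".//*[@x]")]

print(repr(text_a), repr(text_b), repr(missing), with_attr)
```

The distinction between an empty string and the default can be handy when you need to tell "tag present but empty" apart from "tag absent".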

Reference:
http://lxml.de/tutorial.html#elementpath

(This answer is a selection of the relevant content from that link.)
