lxml equivalent of BeautifulSoup's "OR" syntax?
I'm converting some HTML parsing code from BeautifulSoup to lxml. I'm trying to figure out the lxml equivalent syntax for the following BeautifulSoup statement:
soup.find('a', {'class': ['current zzt', 'zzt']})
Basically, I want to find all of the "a" tags in the document that have a class attribute of either "current zzt" or "zzt". BeautifulSoup allows one to pass in a list, a dictionary, or even a regular expression to perform the match.
What is the lxml equivalent?
Thanks!
No, lxml does not provide the "find first or return None" method you're looking for. Just use
(select(soup) or [None])[0]
if you need that, or write a function to do it for you.
Ok, so
soup.find('a')
would indeed find the first "a" element or return None, as you expect. The trouble is that it doesn't appear to support the rich XPath syntax needed for CSSSelector.
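One possible way to put this together is sketched below. It uses the xpath() method of lxml elements rather than CSSSelector, so treat it as a substitute for the "select" callable above, not the answer's exact code. The sample markup, the two XPath expressions, and the first_or_none helper are all assumptions made up for illustration.

```python
from lxml import html

# Hypothetical sample markup, invented for this sketch.
SAMPLE = """
<html><body>
  <a class="current zzt" href="/one">one</a>
  <a class="zzt" href="/two">two</a>
  <a class="other" href="/three">three</a>
</body></html>
"""

# Exact match on the whole class attribute, mirroring the list
# ['current zzt', 'zzt'] passed to BeautifulSoup in the question.
EXACT = '//a[@class="current zzt" or @class="zzt"]'

# Token match: any <a> whose space-separated class list contains "zzt".
# This is roughly what the CSS selector a.zzt would select.
TOKEN = '//a[contains(concat(" ", normalize-space(@class), " "), " zzt ")]'


def first_or_none(matches):
    """Return the first item of a result list, or None if it is empty."""
    return matches[0] if matches else None


doc = html.fromstring(SAMPLE)

exact_links = doc.xpath(EXACT)   # list of matching <a> elements
token_links = doc.xpath(TOKEN)   # same elements for this sample

first_link = first_or_none(exact_links)                   # an element
missing = first_or_none(doc.xpath('//a[@class="nope"]'))  # None
```

The helper does the same job as the (select(soup) or [None])[0] idiom, just with a name. If the cssselect package is installed, lxml.cssselect.CSSSelector("a.zzt") should select the same elements as the token-match expression.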