Problem finding links in all LI tags under a UL tag
I am trying to get the links from all the LI tags under the UL tag in the following HTML:
<div id="chapter-list" class="sbox" style="">
<ul>
<li>
<a href="https://example.com/manga/name/2">
<div class="chpbox">
<span class="chapternum">
Chapter 2 </span>
</div>
</a>
</li>
<li>
<a href="https://example.com/manga/name/1">
<div class="chpbox">
<span class="chapternum">
Chapter 1 </span>
</div>
</a>
</li>
</ul>
</div>
The code I wrote:
from bs4 import BeautifulSoup
import requests
html_page = requests.get('https://example.com/manga/name/')
soup = BeautifulSoup(html_page.content, 'html.parser')
chapters = soup.find('div', {"id": "chapter-list"})
children = chapters.findChildren("ul", recursive=False)  # when printed, it gives the whole ul content
for litag in children.find('li'):
    print(litag.find("a")["href"])
When I try to print the LI tag links, I get the following error:
Traceback (most recent call last):
  File "C:\0.py", line 12, in <module>
    for litag in children.find('li'):
  File "C:\Users\hs\AppData\Local\Programs\Python\Python310\lib\site-packages\bs4\element.py", line 2289, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
You can use find to locate the ul in the chapter list, then find_all to get the list items in that ul. Finally, use find_all again to find the links in each list item and print each URL. Details of these two methods are in the find and find_all documentation for bs4. To get a link's text, such as Chapter 1, search each link for the class chapternum and call get_text() on the result. Searching by class is covered in the bs4 documentation on searching for elements by class.
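A minimal sketch of the find / find_all / get_text() approach described in this answer. It assumes the page has the same structure as the HTML in the question, and it parses an inline copy of that HTML rather than fetching the page with requests. The names chapter_list, ul and chapters are illustrative choices, not taken from the original post.

```python
from bs4 import BeautifulSoup

# Inline copy of the HTML from the question, so no network request is needed.
html = """
<div id="chapter-list" class="sbox" style="">
<ul>
<li>
<a href="https://example.com/manga/name/2">
<div class="chpbox">
<span class="chapternum">
Chapter 2 </span>
</div>
</a>
</li>
<li>
<a href="https://example.com/manga/name/1">
<div class="chpbox">
<span class="chapternum">
Chapter 1 </span>
</div>
</a>
</li>
</ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns a single Tag (or None), so another search can be chained on it.
chapter_list = soup.find("div", id="chapter-list")
ul = chapter_list.find("ul", recursive=False)

chapters = []  # (title, url) pairs, collected here purely for illustration

# find_all() returns a ResultSet (a list of Tags), so it has to be iterated.
for li in ul.find_all("li"):
    for a in li.find_all("a"):
        # Search by class inside the link, then strip surrounding whitespace.
        title = a.find("span", class_="chapternum").get_text(strip=True)
        chapters.append((title, a["href"]))
        print(title, a["href"])
```

With the HTML above this should print each chapter title followed by its URL, Chapter 2 first. The error in the question comes from findChildren, which returns a ResultSet (a list of elements) even when only one ul matches, so calling find on it fails; using find for the single ul and find_all for the repeated li tags avoids that.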