BeautifulSoup .find() always returns None when web scraping
Relevant part of the DOM:
Screenshot of the DOM
This is the code I wrote:
from bs4 import BeautifulSoup
import requests
URL = 'https://www.cheapflights.com.sg/flight-search/SIN-KUL/2022-06-04?sort=bestflight_a&attempt=3&lastms=1653844067064'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
flight = soup.find('div', class_= 'resultWrapper')
print(flight)
The result that I get whenever print(flight) is executed is always None. I have tried changing to div tags with different class names but it still always returns None.
The soup seems to be fine, though, because when I execute print(soup) it returns a text version of the DOM, so the problem seems to be with the next line.
Any suggestions on how I can get something other than None? Thank you!
Comments (1)
That's because of the User-Agent. If I try to curl the page without changing the default User-Agent, it'll return this page.
Change your code like this, so that your program is less likely to be detected:
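A minimal sketch of that change, assuming a browser-like User-Agent string is enough for the site to serve the normal page. The `HEADERS` value and the helper names `find_result` and `fetch_flight` are illustrative and not taken from the original answer; the URL and the `resultWrapper` class come from the question.

```python
from bs4 import BeautifulSoup
import requests

URL = 'https://www.cheapflights.com.sg/flight-search/SIN-KUL/2022-06-04?sort=bestflight_a&attempt=3&lastms=1653844067064'

# Assumption: any common desktop-browser User-Agent string; this one is only an example.
HEADERS = {
    'User-Agent': (
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
        'AppleWebKit/537.36 (KHTML, like Gecko) '
        'Chrome/101.0.0.0 Safari/537.36'
    )
}


def find_result(html):
    """Parse HTML and return the first div with class 'resultWrapper', or None."""
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find('div', class_='resultWrapper')


def fetch_flight(url=URL, headers=HEADERS):
    """Request the page with a browser-like User-Agent and look for the result div."""
    page = requests.get(url, headers=headers, timeout=10)
    # Fail loudly if the site still rejects the request (e.g. 403).
    page.raise_for_status()
    return find_result(page.content)
```

Calling `print(fetch_flight())` performs the request with the custom header. Comparing `page.status_code` and the returned HTML with and without `headers=` is a quick way to check whether the User-Agent is what the site reacts to.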