Web scraping returns an empty list
from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.flaconi.de/haare/kerastase/chronologiste/kerastase-chronologiste-bain-regenerant-haarshampoo.html?yoReviewsPage=2')
soup = BeautifulSoup(driver.page_source, 'lxml')
soup.find_all('div', class_='content-review')
# this always returns an empty list
# I want to scrape all review contents, e.g. "<div class="content-review" id="325243269"> Super Shampoo, meine Haare glänzt und sind sehr weich. </div>"
I have tried multiple ways, but it always returns an empty list.
How can I solve this problem?
Answers (2)
You need to wait until the page has completely loaded. Add the necessary libs and wait for the review elements before parsing, as in the sketch below:
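A minimal sketch of this approach, assuming the reviews end up in div.content-review elements as shown in the question; the 30-second timeout and the CSS selector are assumptions:

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://www.flaconi.de/haare/kerastase/chronologiste/kerastase-chronologiste-bain-regenerant-haarshampoo.html?yoReviewsPage=2')

# Block (up to 30 s) until at least one review element is present in the DOM
WebDriverWait(driver, 30).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.content-review'))
)

# The rendered HTML now contains the reviews, so BeautifulSoup can find them
soup = BeautifulSoup(driver.page_source, 'lxml')
reviews = [div.get_text(strip=True) for div in soup.find_all('div', class_='content-review')]
print(reviews)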
OUTPUT: the scraped review texts (not reproduced here).
Second option: find the request that returns the reviews and get the data from it directly:
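A sketch of that idea, assuming the reviews arrive via a separate XHR request that returns JSON; the endpoint URL, parameters, and response keys below are placeholders you must copy from the browser's Network tab in DevTools:

import requests

# Placeholder URL: open DevTools -> Network, reload the page, and copy the
# request that actually returns the reviews (the review widget loads them via XHR)
REVIEWS_URL = 'https://example.invalid/reviews-endpoint-from-devtools'

response = requests.get(REVIEWS_URL, params={'page': 2})
response.raise_for_status()

data = response.json()
# Key names depend on the real endpoint -- adjust after inspecting the response
for review in data.get('reviews', []):
    print(review.get('content'))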
With the same output.
The main issue here is that you need to close the 'accept cookies' popup, which is located in a shadow DOM.
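A minimal sketch of dismissing such a popup, assuming the consent banner is rendered inside an open shadow root; the host selector (#usercentrics-root) and the button selector are assumptions based on a typical Usercentrics banner and must be verified against the actual page in DevTools:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://www.flaconi.de/haare/kerastase/chronologiste/kerastase-chronologiste-bain-regenerant-haarshampoo.html?yoReviewsPage=2')

# Wait for the consent widget's host element (selector is an assumption)
WebDriverWait(driver, 30).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, '#usercentrics-root'))
)

# Pierce the shadow root from JavaScript and click the accept button
# (inner selector is an assumption -- inspect the popup in DevTools)
driver.execute_script(
    "document.querySelector('#usercentrics-root').shadowRoot"
    ".querySelector('button[data-testid=\"uc-accept-all-button\"]').click();"
)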
Popup image: (screenshot of the cookie-consent popup, omitted here)