Scraping data from an e-commerce website with Python

Posted on 2025-02-12 14:31:51

I managed to scrape data from eBay and am trying to do the same on another site, but its HTML is structured slightly differently and I cannot get the data out.
This is the code I am trying:

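import requests
from bs4 import BeautifulSoup

k = requests.get('https://www.skroutz.gr/plus-deals').text
soup = BeautifulSoup(k, 'html.parser')
# class value as it appears in the page, including the line break
productlist = soup.find_all("li", {"class": "cf card\nadd-to-cart-cta"})
print(productlist)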

I think the problem is that the class attribute contains a line break.

I also tried to scrape the title from the same page, but that did not work either.

This is the link: skroutz.gr/plus-deals

Thank you

Comments (1)

2025-02-19 14:31:51

You don't have to specify every class on the element. Matching on a single class, such as cf, is enough. Try this:

from bs4 import BeautifulSoup
import requests

html = requests.get('https://www.skroutz.gr/plus-deals').text
soup = BeautifulSoup(html,'html.parser')

products_list = soup.find_all("li",{"class":"cf"})
for product in products_list:
    print(product)
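
To see why one class is enough, here is a small offline sketch. The HTML string below is made up to imitate a class attribute that wraps onto a second line; it is not copied from skroutz.gr, and the a.title selector is only an assumption about where a title might live. BeautifulSoup treats class as a multi-valued attribute and splits it on whitespace, so the line break just separates two class names.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating a class attribute that wraps across lines.
# It is NOT taken from skroutz.gr; adjust names to the real page.
SAMPLE_HTML = """
<ul>
  <li class="cf card
add-to-cart-cta"><a class="title" href="/p/1">First product</a></li>
  <li class="cf card"><a class="title" href="/p/2">Second product</a></li>
  <li class="other">Not a product</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# class is multi-valued: the value is split on whitespace (newlines included),
# so this prints a list of separate class names.
first_li = soup.find("li")
print(first_li["class"])

# Matching a single class finds every <li> that carries it.
by_single_class = soup.find_all("li", class_="cf")

# A CSS selector can require several classes at once, in any order.
by_all_classes = soup.select("li.cf.card.add-to-cart-cta")

# Text from a nested element; "a.title" is an assumed selector.
titles = [li.select_one("a.title").get_text(strip=True) for li in by_single_class]
print(titles)
```

If you do need all three classes, the select() form is probably the safer choice, since it does not depend on how the class names are ordered or spaced in the source.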