Webscraping Python BS4 issue: not returning data

Posted on 2025-02-12 20:25:33

I am new here and have read through many of the older posts, but I cannot find exactly what I am looking for.

I am new to web scraping and have successfully scraped data from a handful of sites.

However, I am having an issue with the code below. I am trying to extract the product titles using Beautiful Soup, but something in the code is wrong because it does not return any data. Any help would be appreciated:

from bs4 import BeautifulSoup
import requests
import pandas as pd

webpage = requests.get('https://groceries.asda.com/aisle/beer-wine-spirits/spirits/whisky/1215685911554-1215685911575-1215685911576')

sp = BeautifulSoup(webpage.content, 'html.parser')

title = sp.find_all('h3', class_='co-product__title')

print(title)

I assume my issue lies somewhere in the find_all call, but I cannot quite work out how to resolve it.

Regards
Milan

千秋岁 2025-02-19 20:25:34

You could try this URL instead; it seems to return the information you want:

from bs4 import BeautifulSoup
import requests

webpage = requests.get("https://groceries.asda.com/api/items/iconmetadata?request_origin=gi")

sp = BeautifulSoup(webpage.content, "html.parser")

print(sp)

Let me know if this helps.
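
If that endpoint responds with JSON rather than HTML, it may be simpler to decode it with the json module (or with webpage.json() from requests) than to pass it through BeautifulSoup. A minimal sketch, assuming a made-up payload: the "items" and "name" keys and the helper extract_names are hypothetical and are not taken from the real ASDA response.

```python
import json

# Hypothetical response body; the real API's structure is not known here.
sample_body = '{"items": [{"name": "Example Whisky 70cl"}, {"name": "Another Whisky 1L"}]}'


def extract_names(body):
    """Return the "name" of every entry under the (assumed) "items" key."""
    data = json.loads(body)
    return [item["name"] for item in data.get("items", []) if "name" in item]


names = extract_names(sample_body)
print(names)
```

With a live response you would pass webpage.text to the helper, after inspecting the actual keys the endpoint returns.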

尸血腥色 2025-02-19 20:25:34

Try this:

from bs4 import BeautifulSoup
import requests
import pandas as pd

webpage = requests.get('https://groceries.asda.com/aisle/beer-wine-spirits/spirits/whisky/1215685911554-1215685911575-1215685911576')

sp = BeautifulSoup(webpage.content, 'html.parser')

title = sp.find_all('h3', {'class':'co-product__title'})

print(title[0])

Also, I prefer the lxml parser:

sp = BeautifulSoup(webpage.text, 'lxml')

Also note that find_all returns a list of all elements with that class. If you want just the first instance, use .find, i.e.:

title = sp.find('h3', {'class':'co-product__title'})
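
The difference between find_all and find can be sketched on a small inline document. This is only an illustration: the markup below is made up and is not the real ASDA page. Note that class_='...' and {'class': '...'} are equivalent filters, so switching between them does not change what is matched.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; not the real ASDA page.
html = """
<ul>
  <li><h3 class="co-product__title">Whisky A</h3></li>
  <li><h3 class="co-product__title">Whisky B</h3></li>
</ul>
"""

sp = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet of every match (empty if none).
titles = sp.find_all("h3", class_="co-product__title")
title_texts = [t.get_text(strip=True) for t in titles]

# find returns only the first match, or None if nothing matches.
first = sp.find("h3", {"class": "co-product__title"})
first_text = first.get_text(strip=True) if first is not None else None

# A class that does not occur yields an empty result, not an error.
missing = sp.find_all("h3", class_="no-such-class")

print(title_texts)
print(first_text)
print(len(missing))
```

Because an empty result is possible, indexing with title[0] is only safe after checking that the list is not empty.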

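Sorry to rain on your parade, but you won't be able to scrape this data without a webdriver, because the product titles are rendered by JavaScript after the page loads. Alternatively, you can call the API directly. You should research how to get the post-rendered JS output in Python.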
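
One way to confirm that the titles are added by JavaScript is to check whether the class name appears anywhere in the HTML that requests receives. A minimal sketch using only the standard library; ClassFinder and class_in_static_html are hypothetical helpers, and the two HTML strings are made-up stand-ins for a server-rendered page and a JavaScript shell.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag carries the target CSS class."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.found = False

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # The class attribute may hold several space-separated names.
            if name == "class" and value and self.target in value.split():
                self.found = True


def class_in_static_html(html, css_class):
    """Return True if css_class is present on any tag in the raw HTML."""
    finder = ClassFinder(css_class)
    finder.feed(html)
    return finder.found


# Made-up examples: one page with the element, one empty JavaScript shell.
server_rendered = '<h3 class="co-product__title">Whisky A</h3>'
js_shell = '<div id="root"></div><script src="/app.js"></script>'

print(class_in_static_html(server_rendered, "co-product__title"))
print(class_in_static_html(js_shell, "co-product__title"))
```

With the real page you would pass webpage.text. If the check returns False, the elements are not in the downloaded HTML, and a browser-driven tool or the underlying API is needed.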
