Web scraping with BeautifulSoup in Python
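How can I use the json module to extract the price from data that the page provides in JSON format in an inline script?
I tried to extract the price from https://glomark.lk/top-crust-bread/p/13676, but I couldn't get the price value.
Please help me solve this.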
import requests
import json
import sys
sys.path.insert(0, 'bs4.zip')
from bs4 import BeautifulSoup

user_agent = {
    'User-agent': 'Mozilla/5.0 Chrome/35.0.1916.47'
}
headers = user_agent
url = 'https://glomark.lk/top-crust-bread/p/13676'
req = requests.get(url, headers=headers)
soup = BeautifulSoup(req.content, 'html.parser')
products = soup.find_all("div", class_="details col-12 col-sm-12 col-md-6 col-lg-5 col-xl-5")
for product in products:
    product_name = product.h1.text
    product_price = product.find(id='product-promotion-price').text
    print(product_name)
    print(product_price)
Comments (2)
You can get the JSON data (the price) from a hidden API using only the requests module. The product name, however, is not dynamic.
Output:
Full working code:
Output:
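A minimal sketch of the hidden-API approach described above. The endpoint URL, the `id` query parameter, and the `{"data": {"price": ...}}` response shape are all assumptions made for illustration, not the site's real API; the actual request can usually be found in the Network tab of the browser's developer tools while the product page loads. `pick_price` and `fetch_price` are hypothetical helper names.

```python
import requests

# Hypothetical endpoint and parameter name (assumptions, not the real API).
API_URL = "https://example.com/api/product-details"
HEADERS = {"User-Agent": "Mozilla/5.0"}


def pick_price(payload: dict) -> float:
    """Pull the price out of the decoded JSON.

    Assumed response shape: {"data": {"price": "<number as string>"}}.
    """
    return float(payload["data"]["price"])


def fetch_price(product_id: int, session=None) -> float:
    """Request the JSON document for one product and return its price."""
    http = session or requests.Session()
    response = http.get(
        API_URL,
        params={"id": product_id},
        headers=HEADERS,
        timeout=10,
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    # Response.json() decodes the body into Python objects, so the json
    # module is not needed when the endpoint already returns JSON.
    return pick_price(response.json())
```

Once the real endpoint and key names are substituted, a call such as `fetch_price(13676)` would return the price as a float. Because this kind of endpoint typically returns only the dynamic fields, the product name still has to come from the page HTML.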
As mentioned, the content is provided dynamically by JavaScript, so one approach is to grab the data directly from the script tag, which you already figured out in your question. This will give you a dict with the product information:
Simply pick the information you need, such as the price:
Example
Output
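The script-tag approach described above can be sketched as follows. This is an illustration under stated assumptions: it assumes the page embeds its product data in a `<script type="application/ld+json">` tag and that the price sits under `offers[0].price`. The sample HTML, its values, and the helper name `extract_product_data` are hypothetical stand-ins, so the real page may need a different selector or key path.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical stand-in for the product page; the real markup may differ.
SAMPLE_HTML = """
<html><body>
<h1>Top Crust Bread</h1>
<script type="application/ld+json">
{"@type": "Product", "name": "Top Crust Bread",
 "offers": [{"@type": "Offer", "price": "100", "priceCurrency": "LKR"}]}
</script>
</body></html>
"""


def extract_product_data(html: str) -> dict:
    """Return the first JSON-LD block in the page as a Python dict."""
    soup = BeautifulSoup(html, "html.parser")
    script = soup.find("script", type="application/ld+json")
    if script is None:
        raise ValueError("no application/ld+json script tag found")
    # script.string is the raw JSON text inside the tag;
    # json.loads turns it into ordinary dicts and lists.
    return json.loads(script.string)


data = extract_product_data(SAMPLE_HTML)
price = data["offers"][0]["price"]  # assumed key layout, see SAMPLE_HTML
print(data["name"], price)
```

To run this against the live page, pass `req.content` from the question's code to `extract_product_data` instead of `SAMPLE_HTML`, then inspect the returned dict to confirm where the price actually lives.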