Does continue work in Beautiful Soup?
from unittest import skip
import requests
from bs4 import BeautifulSoup
from csv import writer
import openpyxl

wb = openpyxl.load_workbook('Book3.xlsx')
ws = wb.active

with open('mtbs.csv', 'w', encoding='utf8', newline='') as f_output:
    csv_output = writer(f_output)
    header = ['Code','Product Description']
    csv_output.writerow(header)
    for row in ws.iter_rows(min_row=1, min_col=1, max_col=1, values_only=True):
        url = f"https://www.radwell.com/en-US/Buy/MITSUBISHI/MITSUBISHI/{row[0]}"
        print(url)
        req_page = requests.get(url)
        soup = BeautifulSoup(req_page.content, 'html.parser')
        div_techspec = soup.find('div', class_="minitabSection")
        if 'minitabSection' not in url:
            continue  # does not work
        code = div_techspec.find_all('li')
        description1 = div_techspec.find_all('li')
        description2 = div_techspec.find_all('li')
        description3 = div_techspec.find_all('li')
        info = [code[0].text, description1[1].text, description2[2].text, description3[3].text]
        csv_output.writerow(info)
I am trying to collect data from a website. I have an Excel sheet containing hundreds of product codes. However, some products do not exist on the website I am scraping, and the loop stops running when it reaches one of them.
I am having trouble with this part: if 'minitabSection' not in url: continue
URLs that do not exist should be skipped, and the loop should continue with the remaining codes. How do I achieve this?
Comments (1)
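I am not sure the string minitabSection ever appears in your URL. Since you use find() to get div_techspec, you should check its result instead: find() returns None when the page has no matching div, so skip that row with continue. Or, the other way around, process the row only when div_techspec is not None.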
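The check on the result of find() could be sketched roughly as follows. This is a minimal, offline illustration and not the asker's exact script: the two HTML snippets, the placeholder item texts, and the helper name extract_info are all hypothetical, standing in for pages that would normally come from requests.get(url).

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for downloaded pages (assumption, not real site markup).
FOUND_PAGE = """
<div class="minitabSection">
  <ul><li>ABC-123</li><li>spec one</li><li>spec two</li><li>spec three</li></ul>
</div>
"""
MISSING_PAGE = "<div class='error'>Product not found</div>"


def extract_info(html, count=4):
    """Return the text of the first `count` <li> items, or None if they are not there."""
    soup = BeautifulSoup(html, "html.parser")
    div_techspec = soup.find("div", class_="minitabSection")
    if div_techspec is None:
        # find() returns None when no matching element exists.
        return None
    items = div_techspec.find_all("li")
    if len(items) < count:
        # Guard against an IndexError when the list is shorter than expected.
        return None
    return [li.get_text(strip=True) for li in items[:count]]


rows = []
for html in (FOUND_PAGE, MISSING_PAGE):
    info = extract_info(html)
    if info is None:
        continue  # skip this product and move on to the next one
    rows.append(info)

print(rows)
```

In the original loop, the same idea would mean testing div_techspec right after soup.find(...) and calling continue when it is None, before any find_all('li') call. The "other way around" variant wraps the find_all and writerow lines in an if div_techspec is not None: block instead.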