My Python code for web scraping and downloading five images doesn't work. Using Blender (3D) as the IDE
I am running this code to web scrape and download 5 images from Google. What is supposed to happen is that after I run the code, a Chrome browser window comes up, the code clicks on an image, downloads it, scrolls down to another image, clicks and downloads that one, and so on, up to five times. What actually happens is that the Chrome browser comes up for a second, then closes, and nothing else happens... though the code doesn't throw any Python errors.
I am using the Blender 3D modeling software as my IDE because I hope to use this Python code in a Blender add-on in the future (a Blender add-on is a bit like an extension in Google Chrome: a small piece of software you install into Blender to extend its functionality). That is the reason for the extra import lines at the top of my code...
One relevant item is that I get this warning:
E:\GLOBAL ASSETS\SCRIPTING\Web Scraping Images\web-scraper.blend\web-scraper.py:21: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
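From the warning text, my understanding is that newer Selenium versions want the chromedriver path wrapped in a Service object, roughly like this (I have not actually switched my script over yet, so this is untested on my side):

from selenium.webdriver.chrome.service import Service

service = Service("E:\\GLOBAL ASSETS\\SCRIPTING\\Web Scraping Images\\chromedriver.exe")
wd = webdriver.Chrome(service=service)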
This is the only other piece of information in my console after I run the code:
DevTools listening on ws://127.0.0.1:52643/devtools/browser/ea448f70-0066-4d50-bfb8-8671528789b8
Any help on this would be appreciated.
import bpy
import subprocess
import sys
import os
import cv2
import random
from random import randrange
from PIL import Image #make sure both pil from c:\users\mjoe6\appdata\local\packages\pythonsoftwarefoundation.python.3.10_qbz5n2kfra8p0\localcache\local-packages\python310\site-packages is in blender pip3.exe folder
from selenium import webdriver
from selenium.webdriver.common.by import By
import requests
import io
import time
# path to python.exe
python_exe = os.path.join(sys.prefix, 'bin', 'python.exe')
py_lib = os.path.join(sys.prefix, 'lib', 'site-packages','pip')
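# Note: in Blender, python_exe above is typically used to pip-install third-party
# modules (selenium, Pillow, opencv-python, requests) into Blender's bundled Python.
# A common pattern is shown below, illustrative only and commented out so it does
# not run every time the script executes:
# subprocess.call([python_exe, "-m", "pip", "install", "selenium"])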
PATH = "E:\\GLOBAL ASSETS\\SCRIPTING\\Web Scraping Images\\chromedriver.exe"
wd = webdriver.Chrome(PATH)
def get_images_from_google(wd, delay, max_images):
    def scroll_down(wd):
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(delay)

    url = "https://www.google.com/search?q=cats+2019+IMDb&sxsrf=ALiCzsZmBIp-JZmZv23v6ORoc0VL2NRuxg:1654543304286&source=lnms&tbm=isch&sa=X&ved=2ahUKEwiRjquPxpn4AhUymo4IHbC6Cx0Q_AUoAnoECAEQBA&biw=1536&bih=714&dpr=1.25#imgrc=ixH-uoDQFN_gpM"
    wd.get(url)

    image_urls = set()
    while len(image_urls) < max_images:
        scroll_down(wd)
        thumbnails = wd.find_elements(By.CLASS_NAME, "Q4LuWd")

        for img in thumbnails[len(image_urls):max_images]:
            try:
                img.click()
                time.sleep.delay
            except:
                continue

            images = wd.find_elements(By.CLASS_NAME, "n3VNCb")
            for image in images:
                if image.get_attribute('src') and 'http' in image.get_attribute('src'):
                    image_urls.add(image.get_attribute('src'))
                    print(f"Found {len(image_urls)}")

    return image_urls
def download_image(download_path, url, file_name):
    try:
        image_content = requests.get(url).content
        image_file = io.BytesIO(image_content)
        image = Image.open(image_file)
        file_path = download_path + file_name

        with open(file_path, "wb") as f:
            image.save(f, "PNG")

        print("Success")
    except Exception as e:
        print('FAILED -', e)
urls = get_images_from_google(wd, 1, 5)
print(urls)
wd.quit()
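Note that download_image is defined above but never called yet; the plan is to eventually feed it the returned URLs, roughly like this (the destination folder below is just a placeholder, not part of the script yet):

download_folder = "E:\\GLOBAL ASSETS\\SCRIPTING\\Web Scraping Images\\"  # hypothetical destination folder
for i, url in enumerate(urls):
    download_image(download_folder, url, str(i) + ".png")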
Comments (1)
The error could be in these lines. In the click loop,
time.sleep.delay
should change to something like
time.sleep(delay)
Also, in the line
if image.get_attribute('src') and 'http' in image.get_attribute('src'):
you should change http to https.
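In context, the inner loop with both of those changes applied would look roughly like this (a sketch of the suggestion above, not a guaranteed fix for the whole script):

for img in thumbnails[len(image_urls):max_images]:
    try:
        img.click()
        time.sleep(delay)  # actually pause, instead of the no-op time.sleep.delay
    except:
        continue

    images = wd.find_elements(By.CLASS_NAME, "n3VNCb")
    for image in images:
        # only keep full-resolution https URLs
        if image.get_attribute('src') and 'https' in image.get_attribute('src'):
            image_urls.add(image.get_attribute('src'))
            print(f"Found {len(image_urls)}")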