urlretrieve is really slow at downloading images from a specific website in my program
I'm using urlretrieve from the urllib.request library to download images from a website.
My code is extremely slow: it took 12 minutes to save 4 images (64x64 PNG files). That isn't normal, since I've tested it on other sites and it runs much faster there (3 minutes for one image is not normal). Is the problem coming from the website or from my computer? (My network connection is fast.)
Here is the code:
import urllib.request
from PIL import Image
import os.path
import json

# Load and edit latest crypto data for cards
with open("json/latest_crypto.json", 'r') as latest_crypto_json:
    latest_crypto = json.load(latest_crypto_json)
del latest_crypto["status"]

for i in latest_crypto['data']:
    logo_online_adress = "https://s2.coinmarketcap.com/static/img/coins/64x64/{}.png".format(i)
    logo_local_adress = "misc/cryptoLogo/{}.png".format(i)
    if not os.path.exists(logo_local_adress):
        urllib.request.urlretrieve(logo_online_adress, logo_local_adress)
        current_logo = Image.open(logo_local_adress)
        if current_logo.size != (64, 64):
            resized_logo = current_logo.resize((64,64))
            resized_logo.save(logo_local_adress)
            print(i+" import with resize")
        else:
            print(i+" import without resize")
    else:
        print(i+" already exist")
For context, I'm collecting cryptocurrency logos from CoinMarketCap for later use in HTML code.
For each logo, I check whether it already exists in the destination folder; if it doesn't, I download it and resize it if needed.
This might be messy, but everything around this line works as intended:
urllib.request.urlretrieve(logo_online_adress, logo_local_adress)
My only problem is speed. I can't use this script as it is right now because it is far too slow.
Comments (2)
You could try fetching the picture with curl and see whether that is faster. If it isn't, try your web browser.
If the browser is faster, you may have to pose as a browser client by sending the same headers the browser does.
Some sites go to considerable lengths to fight off automated clients.
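As a rough illustration of the "pose as a browser" suggestion, here is a minimal sketch that sends browser-like headers using only urllib.request. The header values, the helper names build_request and download, and the 10-second timeout are assumptions chosen for illustration; whether the site actually throttles the default Python user agent has not been confirmed here.

```python
import shutil
import urllib.request

# Illustrative browser-like headers (assumed values, not taken from the question).
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept": "image/png,image/*;q=0.8,*/*;q=0.5",
}


def build_request(url):
    """Return a Request for url that carries the browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def download(url, dest_path, timeout=10):
    """Fetch url with browser-like headers and stream the body to dest_path.

    The timeout makes a stalled connection raise an error instead of
    hanging for minutes.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        with open(dest_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
```

In the question's loop, download(logo_online_adress, logo_local_adress) could stand in for the urlretrieve call. If the timings stay the same with these headers, the user agent is probably not the cause.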
You can try this. In my case requests was faster than urllib, so I wrote this answer:
https://stackoverflow.com/a/75261338/5053475
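The linked answer is not reproduced here. As a separate, hedged sketch of the same idea, the snippet below downloads a logo with the third-party requests package through a shared Session, which keeps the connection open between downloads. The helper names make_session and fetch_logo, the user-agent string, and the timeout are hypothetical choices, and this may or may not remove the slowdown described in the question.

```python
import requests  # third-party: pip install requests

# URL pattern taken from the question's code.
COIN_LOGO_URL = "https://s2.coinmarketcap.com/static/img/coins/64x64/{}.png"


def make_session():
    """Create a Session so repeated downloads can reuse one connection."""
    session = requests.Session()
    # Assumed user-agent value, for illustration only.
    session.headers.update({"User-Agent": "Mozilla/5.0"})
    return session


def fetch_logo(session, coin_id, dest_path, timeout=10):
    """Download one logo through the shared session and write it to dest_path."""
    response = session.get(COIN_LOGO_URL.format(coin_id), timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of saving an error page
    with open(dest_path, "wb") as out_file:
        out_file.write(response.content)


# Example usage (performs real network requests):
# with make_session() as session:
#     for coin_id in latest_crypto["data"]:
#         fetch_logo(session, coin_id, "misc/cryptoLogo/{}.png".format(coin_id))
```

Passing the session in as a parameter keeps fetch_logo easy to time or test in isolation, for example with a stand-in object that exposes the same get method.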