Downloading images from multiple websites concurrently with Python
I'm trying to download multiple images concurrently using Python over the internet, and I've looked at several options, but none of them seem satisfactory.
I've considered pyCurl, but don't really understand the example code, and it seems to be way overkill for a task as simple as this.
urlgrabber seems to be a good choice, but the documentation says that the batch download feature is still in development.
I can't find anything in the documentation for urllib2.
Is there an option that actually works and is simpler to implement? Thanks.
Comments (1)
It's not fancy, but you can use urllib.urlretrieve and a pool of threads or processes running it. Because they're waiting on network IO, you can get multiple threads running concurrently - stick the URLs and destination filenames in a Queue.Queue and have each thread suck them up.
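As a concrete illustration, here is a minimal sketch of that thread-plus-queue approach, written with the Python 2 names used above (in Python 3 the equivalents are urllib.request.urlretrieve and queue.Queue). The URLs, filenames, and thread count are placeholders for whatever you actually need to download.

```python
import threading
import urllib
import Queue

# Placeholder (url, destination filename) pairs - substitute your own.
jobs = [
    ("http://example.com/a.jpg", "a.jpg"),
    ("http://example.com/b.jpg", "b.jpg"),
]

q = Queue.Queue()
for job in jobs:
    q.put(job)

def worker():
    # Each thread pulls (url, filename) pairs off the queue until it is empty.
    while True:
        try:
            url, filename = q.get_nowait()
        except Queue.Empty:
            return
        urllib.urlretrieve(url, filename)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Since the threads spend most of their time blocked on network IO, the GIL is not a bottleneck here.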
If you use multiprocessing, it's even easier - just create a Pool of processes and call mypool.map with the function and an iterable of arguments. There isn't a thread pool in the standard library, but you can get a third-party module if you need to avoid launching separate processes.
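A minimal sketch of the multiprocessing variant, again with placeholder URLs and filenames; Pool.map blocks until every download has finished.

```python
import multiprocessing
import urllib

def fetch(job):
    # job is a (url, destination filename) pair.
    url, filename = job
    urllib.urlretrieve(url, filename)
    return filename

if __name__ == "__main__":
    # Placeholder download jobs - substitute your own.
    jobs = [
        ("http://example.com/a.jpg", "a.jpg"),
        ("http://example.com/b.jpg", "b.jpg"),
    ]
    pool = multiprocessing.Pool(processes=4)
    # map distributes the jobs across the worker processes and
    # returns the list of downloaded filenames once all are done.
    results = pool.map(fetch, jobs)
    pool.close()
    pool.join()
```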