Download Returned Zip file from URL
If I have a URL that, when submitted in a web browser, pops up a dialog box to save a zip file, how would I go about catching and downloading this zip file in Python?
As far as I can tell, the proper way to do this in Python 2 is to fetch the URL with `requests` and wrap the response content in a `StringIO` object. Of course, you'd want to check that the GET was successful with `r.ok`. For Python 3+, replace the `StringIO` module with the `io` module and use `BytesIO` instead of `StringIO`; the Python 3 release notes mention this change.
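A minimal Python 3 sketch of the approach described above, assuming the URL returns a zip archive directly. The function names and the `dest_dir` parameter are made up for illustration and are not taken from the answer.

```python
import io
import zipfile


def extract_zip_bytes(data, dest_dir="."):
    """Open zip data held in memory, extract it, and return the member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()


def download_and_extract(url, dest_dir="."):
    """Fetch `url` with requests and extract the zip archive it returns."""
    # requests is third-party; importing it here keeps extract_zip_bytes
    # usable even where requests is not installed.
    import requests

    r = requests.get(url)
    if not r.ok:  # r.ok is True only for status codes below 400
        raise RuntimeError(f"GET {url} failed with status {r.status_code}")
    return extract_zip_bytes(r.content, dest_dir)
```

Note that `r.content` reads the whole archive into memory, so this is best suited to archives of modest size.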
Most people recommend using `requests` if it is available, and the `requests` documentation recommends iterating over a streamed response with `iter_content` and writing each chunk to a file when downloading and saving raw data from a URL. Since the question asks about downloading and saving the zip file, I haven't gone into details regarding reading the zip file. See one of the many other answers for possibilities.
If for some reason you don't have access to `requests`, you can use `urllib.request` instead. It may not be quite as robust as the above.
Finally, if you are still using Python 2, you can use `urllib2.urlopen`.
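For the standard-library fallback, one possible sketch uses `urllib.request.urlopen` together with `shutil.copyfileobj`. The function name `save_url_to_file` and its parameters are illustrative assumptions, not part of the answer.

```python
import shutil
import urllib.request


def save_url_to_file(url, save_path):
    """Copy the response body to `save_path` without loading it all into memory."""
    with urllib.request.urlopen(url) as response, open(save_path, "wb") as out_file:
        # copyfileobj reads and writes in fixed-size blocks
        shutil.copyfileobj(response, out_file)
    return save_path
```

Unlike `requests`, `urlopen` raises `urllib.error.HTTPError` for error status codes, so there is no separate success check here.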
With the help of this blog post, I've got it working with just `requests`. The point of the weird `stream` thing is that we don't need to call `content` on large requests, which would require the whole response to be processed at once, clogging the memory. `stream` avoids this by iterating through the data one chunk at a time.
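A rough sketch of the chunked download described here, using `requests.get(..., stream=True)` and `iter_content`. The function names and the default `chunk_size` of 8192 bytes are my own choices for illustration.

```python
def save_response_in_chunks(response, save_path, chunk_size=8192):
    """Write a streamed response to disk chunk by chunk; return the bytes written."""
    written = 0
    with open(save_path, "wb") as fd:
        for chunk in response.iter_content(chunk_size=chunk_size):
            fd.write(chunk)
            written += len(chunk)
    return written


def download_url(url, save_path, chunk_size=8192):
    """Download `url` to `save_path` without ever touching r.content."""
    # requests is third-party; imported here so the helper above can be
    # exercised without it.
    import requests

    with requests.get(url, stream=True) as r:
        r.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        return save_response_in_chunks(r, save_path, chunk_size)
```

With `stream=True` only the headers are fetched up front; the body is pulled from the connection as the loop asks for each chunk.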
Here's what I got to work in Python 3:
Super lightweight solution to save a .zip file to a location on disk (using Python 3.9):
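One lightweight way to do this with only the standard library is `urllib.request.urlretrieve`. This is a sketch of that idea and not necessarily what this answer used; the function name `save_zip` is hypothetical.

```python
import urllib.request


def save_zip(url, save_path):
    """Download `url` straight to `save_path` and return the local path."""
    # urlretrieve returns a (filename, headers) tuple
    local_path, _headers = urllib.request.urlretrieve(url, save_path)
    return local_path
```

The Python documentation lists `urlretrieve` as a legacy interface that might be deprecated in the future, so the `urlopen` route may be the safer long-term choice.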
I came here searching for how to save a .bzip2 file. Let me paste the code for others who might come looking for this.
I just wanted to save the file as is.
Either use `urllib2.urlopen`, or you could try using the excellent `Requests` module and avoid `urllib2` headaches:
Thanks to @yoavram for the above solution. My URL path linked to a zipped folder, and I encountered a `BadZipfile` error (file is not a zip file). Strangely, after several attempts it would suddenly retrieve the URL and unzip it, so I amended the solution a little bit, using the `is_zipfile` method.
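A sketch of how such a check might look, retrying until `zipfile.is_zipfile` accepts the downloaded bytes. The `fetch` callable, the retry count, and the delay are assumptions made for illustration; the answer's actual amendment is not shown here.

```python
import io
import time
import zipfile


def fetch_zip_with_retry(fetch, attempts=3, delay=1.0):
    """Call `fetch()` until it returns bytes that look like a zip archive.

    `fetch` is any zero-argument callable returning the response body as
    bytes, for example: lambda: requests.get(url).content
    """
    for attempt in range(attempts):
        data = fetch()
        if zipfile.is_zipfile(io.BytesIO(data)):
            return zipfile.ZipFile(io.BytesIO(data))
        if attempt < attempts - 1:
            time.sleep(delay)  # brief pause before asking the server again
    raise zipfile.BadZipFile(f"no valid zip archive after {attempts} attempts")
```

A body that fails the check is often an HTML error or "please wait" page, so inspecting the status code and `Content-Type` header of the failed response can help explain the intermittent failures.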
Use the `requests`, `zipfile` and `io` Python packages. `io.BytesIO` in particular holds the downloaded data in memory, so the archive can be opened without first saving it to the drive.
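As a small sketch of the in-memory idea, the helper below reads every file inside the archive into a dictionary without writing anything to disk. The name `read_zip_members` is hypothetical, and `data` is assumed to be the raw bytes of the response, e.g. `requests.get(url).content`.

```python
import io
import zipfile


def read_zip_members(data):
    """Return {member name: contents as bytes} for each file in the zip data."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            info.filename: zf.read(info)
            for info in zf.infolist()
            if not info.is_dir()  # skip directory entries
        }
```

This keeps both the archive and its decompressed contents in memory, which is convenient for small archives but can get expensive for large ones.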