Create a zip archive for instant download

Posted on 2024-07-24 01:02:32

In a web app I am working on, the user can create a zip archive of a folder full of files. Here's the code:

files = torrent[0].files
zipfile = z.ZipFile(zipname, 'w')
output = ""

for f in files:
    zipfile.write(settings.PYRAT_TRANSMISSION_DOWNLOAD_DIR + "/" + f.name, f.name)

downloadurl = settings.PYRAT_DOWNLOAD_BASE_URL + "/" + settings.PYRAT_ARCHIVE_DIR + "/" + filename
output = "Download <a href=\"" + downloadurl + "\">" + torrent_name + "</a>"
return HttpResponse(output)

But this has the nasty side effect of a long wait (10+ seconds) while the zip archive is being downloaded. Is it possible to skip this? Instead of saving the archive to a file, is it possible to send it straight to the user?

I do believe that torrentflux provides this exact feature I am talking about: being able to zip GBs of data and download it within a second.

愛上了 2024-07-31 01:02:32

Check out "Serving dynamically generated ZIP archives in Django".

给妤﹃绝世温柔 2024-07-31 01:02:32

As mandrake says, the constructor of HttpResponse accepts iterable objects.

Luckily, the ZIP format allows an archive to be created in a single pass, because the central directory record is located at the very end of the file:

[Figure: layout of a ZIP file, with the central directory at the end of the file. Picture from Wikipedia.]

And luckily, zipfile indeed doesn't do any seeks as long as you only add files.

Here is the code I came up with. Some notes:

I'm using this code to zip up a bunch of JPEG pictures. There is no point in compressing them; I'm using ZIP only as a container.
Memory usage is O(size_of_largest_file), not O(size_of_archive). This is good enough for me: many relatively small files that add up to a potentially huge archive.
This code doesn't set the Content-Length header, so the user doesn't get a nice progress indication. It should be possible to calculate this in advance if the sizes of all files are known.
Serving the ZIP straight to the user like this means that resuming downloads won't work.
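
The Content-Length note can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of the original answer: entries are stored uncompressed, no ZIP64 is involved (entries and archive stay well below 4 GiB), there are no extra fields or comments, and the archive is written to a non-seekable sink so that each entry is followed by a 16-byte data descriptor (which is what recent CPython versions of zipfile emit, as far as I can tell). The names `stored_zip_size`, `CountingSink`, and `actual_streamed_size` are hypothetical helpers.

```python
import zipfile

# Fixed record sizes from the ZIP format (variable parts are added per entry).
LOCAL_HEADER = 30     # local file header, without the file name
DATA_DESCRIPTOR = 16  # signature + CRC + two 32-bit sizes, written after the data
CENTRAL_ENTRY = 46    # central directory entry, without the file name
END_RECORD = 22       # end-of-central-directory record, without a comment


def stored_zip_size(entries):
    """Predict the archive size for (archive_name, size_in_bytes) pairs.

    Assumes stored (uncompressed) entries, no ZIP64, no extra fields.
    """
    total = END_RECORD
    for name, size in entries:
        name_len = len(name.encode("utf-8"))
        total += LOCAL_HEADER + name_len + size + DATA_DESCRIPTOR
        total += CENTRAL_ENTRY + name_len
    return total


class CountingSink:
    """Write-only sink that discards data and only counts bytes."""

    def __init__(self):
        self.pos = 0

    def write(self, data):
        self.pos += len(data)
        return len(data)

    def tell(self):
        return self.pos

    def flush(self):
        pass


def actual_streamed_size(files):
    """Measure what zipfile really writes for a {name: bytes} mapping."""
    sink = CountingSink()
    with zipfile.ZipFile(sink, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return sink.pos
```

Since the byte counts depend on how zipfile writes its records, it seems prudent to compare `stored_zip_size` against `actual_streamed_size` on the Python version you deploy before trusting the number in a header.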

So, here goes:

import zipfile

from django.http import StreamingHttpResponse


class ZipBuffer(object):
    """A write-only, file-like object for zipfile.ZipFile to write into."""

    def __init__(self):
        self.data = []
        self.pos = 0

    def write(self, data):
        self.data.append(data)
        self.pos += len(data)
        return len(data)

    def tell(self):
        # zipfile calls this so we need it
        return self.pos

    def flush(self):
        # zipfile calls this so we need it
        pass

    def get_and_clear(self):
        # Hand back the chunks written so far and start a fresh list
        result = self.data
        self.data = []
        return result


def generate_zipped_stream():
    sink = ZipBuffer()
    archive = zipfile.ZipFile(sink, "w")
    for filename in ["file1.txt", "file2.txt"]:
        archive.writestr(filename, "contents of file here")
        for chunk in sink.get_and_clear():
            yield chunk

    archive.close()
    # close() generates some more data, so we yield that too
    for chunk in sink.get_and_clear():
        yield chunk


def my_django_view(request):
    # Current Django versions read an iterator passed to HttpResponse in full
    # before sending anything, so StreamingHttpResponse is used here to send
    # chunks as they are produced. The old mimetype= argument is now content_type=.
    response = StreamingHttpResponse(generate_zipped_stream(), content_type="application/zip")
    response['Content-Disposition'] = 'attachment; filename=archive.zip'
    return response
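
The memory note above (O(size_of_largest_file)) could probably be tightened to roughly one chunk at a time by writing each entry through `ZipFile.open(name, "w")`, which exists in Python 3.6 and later. The following is a minimal sketch under that assumption, not the answer author's code; `WriteOnlySink`, `stream_zip`, and the `chunk_size` default are hypothetical names and values chosen for illustration.

```python
import zipfile


class WriteOnlySink:
    """Collects whatever zipfile writes; offers tell() but no seek()."""

    def __init__(self):
        self.chunks = []
        self.pos = 0

    def write(self, data):
        self.chunks.append(bytes(data))
        self.pos += len(data)
        return len(data)

    def tell(self):
        return self.pos

    def flush(self):
        pass

    def drain(self):
        # Return the pending chunks and reset the buffer
        chunks, self.chunks = self.chunks, []
        return chunks


def stream_zip(members, chunk_size=64 * 1024):
    """Yield archive bytes for (archive_name, binary_file_object) pairs.

    Each source is read chunk_size bytes at a time, so only about one
    chunk is held in memory regardless of file size.
    """
    sink = WriteOnlySink()
    with zipfile.ZipFile(sink, "w") as archive:
        for name, source in members:
            with archive.open(name, "w") as entry:
                while True:
                    block = source.read(chunk_size)
                    if not block:
                        break
                    entry.write(block)
                    yield from sink.drain()
            # Closing the entry writes its trailing data descriptor
            yield from sink.drain()
    # Closing the archive writes the central directory
    yield from sink.drain()
```

In a view, the generator would be passed to the response in the same way as `generate_zipped_stream()`, with each source opened in binary mode, for example `open(path, "rb")`.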

超可爱的懒熊 2024-07-31 01:02:32

Here's a simple Django view function which zips up (as an example) any readable files in /tmp and returns the zip file.

from django.http import HttpResponse
import zipfile
import os
from io import BytesIO  # replaces cStringIO.StringIO, which exists only on Python 2

def somezip(request):
    buffer = BytesIO()
    zf = zipfile.ZipFile(buffer, mode='w', compression=zipfile.ZIP_DEFLATED)
    for fn in os.listdir("/tmp"):
        path = os.path.join("/tmp", fn)
        if os.path.isfile(path):
            try:
                zf.write(path)
            except IOError:
                # skip files that cannot be read
                pass
    zf.close()
    # content_type= replaces the mimetype= argument of older Django releases
    response = HttpResponse(buffer.getvalue(), content_type="application/zip")
    response['Content-Disposition'] = 'attachment; filename=yourfiles.zip'
    return response

Of course, this approach only works if the zip file fits comfortably into memory. If not, you'll have to use a disk file (which you're trying to avoid). In that case, replace buffer = BytesIO() with buffer = open('/path/to/yourfiles.zip', 'wb') and replace buffer.getvalue() with code that reads the contents of the disk file.
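
The disk-file variant described above might look something like this. It is a hedged sketch rather than the answer author's code: it swaps the fixed path for `tempfile.TemporaryFile` (my substitution, so nothing needs cleaning up afterwards), and `build_zip_on_disk` and `read_in_chunks` are hypothetical helper names.

```python
import os
import tempfile
import zipfile


def build_zip_on_disk(paths):
    """Write the archive to an anonymous temp file and return it rewound."""
    archive_file = tempfile.TemporaryFile()
    with zipfile.ZipFile(archive_file, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            if os.path.isfile(path):
                # Store each file under its base name only
                zf.write(path, arcname=os.path.basename(path))
    # ZipFile does not close a file object it was handed, so it can be reused
    archive_file.seek(0)
    return archive_file


def read_in_chunks(fileobj, chunk_size=64 * 1024):
    """Yield the file's contents piece by piece, then close it."""
    try:
        while True:
            block = fileobj.read(chunk_size)
            if not block:
                break
            yield block
    finally:
        fileobj.close()
```

The result of `read_in_chunks(build_zip_on_disk(paths))` is an iterator of bytes, so it could be handed to a streaming response instead of loading the whole archive with a single read. The user still waits while the archive is built, which is the trade-off this answer accepts.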
