Python in-memory zip library
Is there a Python library that allows manipulation of zip archives in memory, without having to use actual disk files?
The ZipFile library does not allow you to update an archive. The only way seems to be to extract it to a directory, make your changes, and create a new zip from that directory. I want to modify zip archives without disk access, because I'll be downloading them, making changes, and uploading them again, so I have no reason to store them.
Something similar to Java's ZipInputStream/ZipOutputStream would do the trick, although any interface at all that avoids disk access would be fine.
Comments (9)
Python 3
According to the Python docs, the file argument of zipfile.ZipFile can be either a path to a file or a file-like object.
So, to open the file in memory, just create a file-like object (perhaps using BytesIO).
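As a minimal sketch of that idea, the snippet below builds an archive in an io.BytesIO buffer and then "modifies" it by copying its entries into a second in-memory archive. The helper name rewrite_zip and the convention that a value of None removes an entry are assumptions made for this illustration, not part of the zipfile API. Since zipfile cannot replace or delete entries in place, rewriting into a new buffer is one straightforward way to do it.

```python
import io
import zipfile

# Build a small archive entirely in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", "alpha")
    zf.writestr("b.txt", "beta")
original = buf.getvalue()


def rewrite_zip(data, changes):
    """Return new archive bytes with the entries named in `changes`
    replaced or added; a value of None drops the entry (hypothetical helper)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        # Copy every entry that is not being changed.
        for info in src.infolist():
            if info.filename not in changes:
                dst.writestr(info, src.read(info.filename))
        # Write the replaced and newly added entries.
        for name, payload in changes.items():
            if payload is not None:
                dst.writestr(name, payload)
    return out.getvalue()


updated = rewrite_zip(original, {"a.txt": "ALPHA", "b.txt": None, "c.txt": b"gamma"})
```

The resulting bytes can be uploaded directly; nothing touches the disk.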
From the article In-Memory Zip in Python:
The example Ethier provided has several problems, some of them major:
[...] InMemoryZip attribute
An updated version is available if you install ruamel.std.zipfile (of which I am the author). After installing it, or including the code for the class from here, you can do:
You can alternatively write the contents using imz.data to any place you need.
You can also use the with statement, and if you provide a filename, the contents of the ZIP will be written on leaving that context:
Because of the delayed writing to disk, you can actually read from an old test.zip within that context.
I am using Flask to create an in-memory zipfile and return it as a download. This builds on the example above from Vladimir. The seek(0) took a while to figure out.
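To illustrate why the seek(0) matters, here is a standard-library-only sketch (the function name make_zip_stream is made up for this example). After the archive is written, the buffer's position sits at the end of the data, so anything that reads the stream from its current position gets nothing until the stream is rewound.

```python
import io
import zipfile


def make_zip_stream(name, payload):
    """Write a one-file archive into a BytesIO and return the buffer
    (hypothetical helper for this illustration)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    return buf


stream = make_zip_stream("report.txt", "hello")

# The position is at the end of the buffer, so a plain read() yields nothing.
read_without_rewind = stream.read()

# Rewind before handing the stream to anything that reads from the current position.
stream.seek(0)
read_after_rewind = stream.read()
```

In a Flask view, the same stream.seek(0) would go just before passing the buffer to flask.send_file. That call is not exercised here, so treat the Flask part as an assumption about how the original answer used it.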
Helper to create an in-memory zip file with multiple files based on data like {'1.txt': 'string', '2.txt': b'bytes'}
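The original helper is not preserved here, so the following is only a sketch of what such a function might look like. The name zip_in_memory is an assumption. It relies on ZipFile.writestr accepting either str (which it encodes as UTF-8) or bytes.

```python
import io
import zipfile
from typing import Mapping, Union


def zip_in_memory(files: Mapping[str, Union[str, bytes]]) -> bytes:
    """Return the bytes of a zip archive containing one entry per item
    in `files` (hypothetical helper; values may be str or bytes)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    # Leaving the with block closes the archive, so the buffer is complete.
    return buf.getvalue()


archive_bytes = zip_in_memory({'1.txt': 'string', '2.txt': b'bytes'})
```

Returning bytes via getvalue() sidesteps the stream-position question; return the BytesIO itself if the caller needs a file-like object.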
This is possible using the two libraries https://github.com/uktrade/stream-unzip and https://github.com/uktrade/stream-zip (full disclosure: written by me). Depending on the changes, you might not even have to store the entire zip in memory at once.
Say you just want to download, unzip, zip, and re-upload. That is slightly pointless, but you could slot in some changes to the unzipped content:
You can use the library libarchive in Python through ctypes. It offers ways of manipulating ZIP data in memory, with a focus on streaming (at least historically).
Say we want to uncompress ZIP files on the fly while downloading from an HTTP server. The below code
can be used as follows to do that
In fact, since libarchive supports multiple archive formats, and nothing above is particularly ZIP-specific, it may well work with other formats.
It's important to note that if you want to use the newly created in-memory zip archive outside of Python, such as saving it to a local disk or sending it through a POST request, it needs to have the end of central directory record written to it; otherwise, it won't be recognized as a valid ZIP file.
This would look like (for Python 3.11)
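The code from this answer is not preserved, so what follows is a sketch using only the public zipfile API rather than the author's exact approach. ZipFile.close() (or leaving a with block) is what writes the central directory and the end of central directory record; bytes taken from the buffer before that point are not a valid archive. The function name build_zip and its close_first flag are invented for the demonstration.

```python
import io
import zipfile


def build_zip(close_first):
    """Return the buffer contents, captured either after or before the
    archive is closed (hypothetical helper for comparison)."""
    buf = io.BytesIO()
    zf = zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED)
    zf.writestr("hello.txt", "hello world")
    if close_first:
        zf.close()  # writes the central directory and the end record
    data = buf.getvalue()
    zf.close()  # harmless if the archive is already closed
    return data


complete = build_zip(close_first=True)
truncated = build_zip(close_first=False)

complete_is_valid = zipfile.is_zipfile(io.BytesIO(complete))
truncated_is_valid = zipfile.is_zipfile(io.BytesIO(truncated))
```

Closing a ZipFile that wraps a caller-supplied BytesIO does not close the buffer itself, so getvalue() still works afterwards.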