Extract files from a zip without keeping the structure using Python ZipFile?
I am trying to extract all files from a .zip archive that contains subfolders into a single folder. I want the files from every subfolder extracted into that one folder, without keeping the original structure. At the moment I extract everything, move the files into one folder, and then remove the leftover subfolders. Files with the same name are overwritten.
Is it possible to do this before the files are written?
Here is an example structure:
my_zip/file1.txt
my_zip/dir1/file2.txt
my_zip/dir1/dir2/file3.txt
my_zip/dir3/file4.txt
In the end I want this:
my_dir/file1.txt
my_dir/file2.txt
my_dir/file3.txt
my_dir/file4.txt
What can I add to this code?
import zipfile

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    zip_file.extract(files, my_dir)
zip_file.close()
If I rename the file paths returned by zip_file.namelist(), I get this error:
KeyError: "There is no item named 'file2.txt' in the archive"
Comments (5)
This opens a file handle for each member of the zip archive, takes the base filename, and copies the contents to a target file. That is how ZipFile.extract works, minus the handling of subdirectories.
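The approach described here can be sketched roughly as follows. This is a minimal illustration and not necessarily the code originally posted; the function name `extract_flat` and its parameters are hypothetical, and it assumes that when two members share a base name the later one simply overwrites the earlier one.

```python
import os
import shutil
import zipfile


def extract_flat(zip_path, target_dir):
    """Copy every file in the archive into target_dir, ignoring subfolders.

    Hypothetical helper for illustration; the name is not from the thread.
    """
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zip_file:
        for member in zip_file.namelist():
            filename = os.path.basename(member)
            # Directory entries end with "/", so their base name is empty.
            if not filename:
                continue
            # Stream the member's bytes into a file directly under target_dir.
            with zip_file.open(member) as source, \
                    open(os.path.join(target_dir, filename), "wb") as target:
                shutil.copyfileobj(source, target)
```

With the names from the question, this would be called as `extract_flat(my_zip, my_dir)`.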
It is possible to iterate over ZipFile.infolist(). On the returned ZipInfo objects you can then manipulate the filename to remove the directory part and finally extract each one to a specified directory.
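A short sketch of that idea, assuming Python 3.6+ (for `ZipInfo.is_dir()`). The function name `extract_flat_via_infolist` is made up for this example. Changing `filename` only affects the in-memory `ZipInfo` object, so the archive on disk is left untouched.

```python
import os
import zipfile


def extract_flat_via_infolist(zip_path, target_dir):
    """Extract all files into target_dir by rewriting ZipInfo.filename.

    Hypothetical helper for illustration; the name is not from the thread.
    """
    with zipfile.ZipFile(zip_path) as zip_file:
        for zip_info in zip_file.infolist():
            # Skip pure directory entries; they carry no file data.
            if zip_info.is_dir():
                continue
            # Keep only the last path component as the extraction name.
            zip_info.filename = os.path.basename(zip_info.filename)
            zip_file.extract(zip_info, target_dir)
```

Passing the `ZipInfo` object itself to `extract()` avoids the lookup by name that raises the `KeyError` shown in the question.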
Just extract the bytes into memory, compute the filename, and write the file there yourself instead of letting the library do it. Mostly, this means using the read() method instead of extract().
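As a rough sketch of the `read()` approach using only `os.path`: the helper name `extract_flat_with_read` is an assumption for illustration, and each member is loaded fully into memory, which may not suit very large files.

```python
import os
import zipfile


def extract_flat_with_read(zip_path, target_dir):
    """Read each member into memory and write it directly under target_dir.

    Hypothetical helper for illustration; the name is not from the thread.
    """
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zip_file:
        for name in zip_file.namelist():
            filename = os.path.basename(name)
            # Directory entries have an empty base name; nothing to write.
            if not filename:
                continue
            data = zip_file.read(name)  # the member's full contents as bytes
            with open(os.path.join(target_dir, filename), "wb") as out:
                out.write(data)
```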
Python 3.6+ update (2020): the same code as in the original answer can be written with pathlib.Path, which eases file-path manipulation and other operations (such as write_bytes). The original code in the answer did not use pathlib.
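A pathlib-based variant might look like the sketch below. It is an illustration under the same assumptions as before (members are read fully into memory, later duplicates overwrite earlier ones), and `extract_flat_pathlib` is a hypothetical name.

```python
import zipfile
from pathlib import Path


def extract_flat_pathlib(zip_path, target_dir):
    """Write every file member directly under target_dir using pathlib.

    Hypothetical helper for illustration; the name is not from the thread.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zip_file:
        for info in zip_file.infolist():
            # Path("dir/").name would be "dir", so skip directories explicitly.
            if info.is_dir():
                continue
            # Path(...).name drops the directory part of the archive path.
            (target / Path(info.filename).name).write_bytes(zip_file.read(info))
```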
A similar concept to Gerhard Götz's solution, but adapted for extracting a single file instead of the entire zip.
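One way this single-file variant could look, as a sketch only: the function name `extract_single_flat` and its parameters are assumptions. Note that `member_name` must be the full path inside the archive (for example `my_zip/dir1/file2.txt`), otherwise `getinfo()` raises the `KeyError` from the question.

```python
import os
import zipfile


def extract_single_flat(zip_path, member_name, target_dir):
    """Extract one member into target_dir without its directory part.

    Hypothetical helper for illustration; the name is not from the thread.
    Returns the path of the extracted file.
    """
    with zipfile.ZipFile(zip_path) as zip_file:
        # Look the member up by its full archive path.
        zip_info = zip_file.getinfo(member_name)
        # Rewrite only the in-memory name used to build the output path.
        zip_info.filename = os.path.basename(member_name)
        return zip_file.extract(zip_info, target_dir)
```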
If you are getting a BadZipFile error, you can unpack the archive with 7-Zip in a subprocess instead, assuming 7-Zip is installed.
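A hedged sketch of calling 7-Zip from Python follows. It assumes the executable is reachable as `7z` on the PATH (on Windows you may need to pass the full path to `7z.exe`); both function names are hypothetical. The `e` command extracts without directory paths, `-o` sets the output directory, and `-y` answers yes to prompts such as overwrite confirmations.

```python
import subprocess


def build_7z_command(archive, target_dir, exe="7z"):
    """Build the 7-Zip command line for a flat extraction.

    Hypothetical helper; `exe` defaults to "7z", which is an assumption
    about how 7-Zip is installed on the machine.
    """
    # "e" ignores stored folders ("x" would keep them).
    # There must be no space between -o and the directory.
    return [exe, "e", archive, "-o" + target_dir, "-y"]


def extract_flat_with_7z(archive, target_dir, exe="7z"):
    """Run 7-Zip and raise CalledProcessError if it reports a failure."""
    subprocess.run(build_7z_command(archive, target_dir, exe), check=True)
```

With the names from the question, this would be called as `extract_flat_with_7z(my_zip, my_dir)`.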