Extracting files from a zip with Python zipfile without keeping the top-level folder
I'm currently using the following code to extract the files from a zip file while keeping the directory structure:
import zipfile
zip_file = zipfile.ZipFile('archive.zip', 'r')
zip_file.extractall('/dir/to/extract/files/')
zip_file.close()
Here is a structure for an example zip file:
/dir1/file.jpg
/dir1/file1.jpg
/dir1/file2.jpg
At the end I want this:
/dir/to/extract/file.jpg
/dir/to/extract/file1.jpg
/dir/to/extract/file2.jpg
But the top-level folder should be dropped only if the zip file has a single top-level folder with all files inside it, so when I extract a zip with this structure:
/dir1/file.jpg
/dir1/file1.jpg
/dir1/file2.jpg
/dir2/file.txt
/file.mp3
It should stay like this:
/dir/to/extract/dir1/file.jpg
/dir/to/extract/dir1/file1.jpg
/dir/to/extract/dir1/file2.jpg
/dir/to/extract/dir2/file.txt
/dir/to/extract/file.mp3
Any ideas?
If I understand your question correctly, you want to strip any common prefix directories from the items in the zip before extracting them.
If so, then the following script should do what you want:
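The script itself is not preserved on this page. As a stand-in, here is a minimal sketch of the idea (not the answerer's original code): find the directories shared by every file in the archive and drop them while writing each entry out. The function name `extract_without_common_prefix` and its parameters are hypothetical, and entries with empty, "." or ".." path components are simply skipped as a basic safety measure.

```python
import os
import shutil
import zipfile


def extract_without_common_prefix(zip_path, dest_dir):
    """Extract zip_path into dest_dir, dropping directories shared by every file."""
    with zipfile.ZipFile(zip_path) as zf:
        # Only real files matter here; directory entries end with "/".
        files = [name for name in zf.namelist() if not name.endswith("/")]
        # Directory components of each file, e.g. "dir1/a.jpg" -> ["dir1"].
        dir_parts = [name.split("/")[:-1] for name in files]
        # commonprefix compares lists element by element, so whole directory
        # names are compared rather than single characters.
        prefix = os.path.commonprefix(dir_parts) if dir_parts else []
        for name in files:
            rel_parts = name.split("/")[len(prefix):]
            if any(part in ("", ".", "..") for part in rel_parts):
                continue  # skip names that could escape dest_dir
            target = os.path.join(dest_dir, *rel_parts)
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            # Copy the entry's bytes into the re-rooted target path.
            with zf.open(name) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
```

With the first archive from the question, every file shares `dir1`, so the files land directly in the destination. With the second archive, `file.mp3` sits at the root, the common prefix is empty, and the structure is kept as is.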
Read the entries returned by ZipFile.namelist() to see if they're in the same directory, and then open/read each entry and write it to a file opened with open().
This might be a problem with the zip archive itself. In a Python prompt, try this to see if the files are in the correct directories in the zip file itself.
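The commands for this check are missing from the page. A sketch of what such a check could look like is below. To keep it runnable anywhere, it builds a tiny in-memory archive first; that archive and its contents are assumptions made for illustration. With a real file you would pass its path (for example 'archive.zip') to `zipfile.ZipFile` instead.

```python
import io
import zipfile

# Build a small in-memory archive purely for demonstration purposes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as out:
    out.writestr("dir1/", "")  # explicit directory entry
    out.writestr("dir1/file1.jpg", b"fake image data")

# Open the archive for reading and look at the first entry's stored name.
zf = zipfile.ZipFile(buffer)
first_file = zf.filelist[0]
print(first_file.filename)
```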
This should say something like "dir1".
Repeat the steps above, substituting an index of 1 into filelist, like so:
first_file = zf.filelist[1]
This time the output should look like 'dir1/file1.jpg'. If it does not, the zip file does not contain directories and everything will be unzipped into one single directory.
Based on @ekhumoro's answer, I came up with a simpler function that extracts everything to the same level. It is not exactly what you asked for, but I think it can help someone.
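The function from this answer is not preserved here either. A minimal sketch of flattening an archive into one directory might look like the following; the name `extract_flat` is hypothetical, and the sketch assumes that files with the same base name in different folders may overwrite each other.

```python
import os
import posixpath
import shutil
import zipfile


def extract_flat(zip_path, dest_dir):
    """Write every file in the archive directly into dest_dir, ignoring folders."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Zip entry names use "/" as the separator on every platform.
            base = posixpath.basename(info.filename)
            if not base or base in (".", ".."):
                continue  # directory entry or unusable name
            target = os.path.join(dest_dir, base)
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
```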
Basically you need to do two things:
The following should retain the overall structure of the zip while removing the root directory:
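The code for this answer is missing from the page. One way to sketch the described behavior is to check whether every entry lives under a single root folder and, if so, rewrite each entry's name before handing it to `ZipFile.extract`. The function name `extract_without_root` is hypothetical, and the sketch assumes entry names are stored without a leading slash, which is how zip archives normally store them.

```python
import zipfile


def extract_without_root(zip_path, dest_dir):
    """Extract into dest_dir, removing the root folder if it is the only one."""
    with zipfile.ZipFile(zip_path) as zf:
        infos = zf.infolist()
        # First path component of every entry, e.g. "dir1/a.jpg" -> "dir1".
        roots = {info.filename.split("/", 1)[0] for info in infos}
        # Strip only when one root exists and no file sits beside it.
        single_root = len(roots) == 1 and all("/" in i.filename for i in infos)
        for info in infos:
            if single_root:
                stripped = info.filename.split("/", 1)[1]
                if not stripped:
                    continue  # this is the root directory entry itself
                # extract() builds the target path from this attribute.
                info.filename = stripped
            zf.extract(info, dest_dir)
```

Deeper folders inside the root are kept, so `dir1/sub/b.txt` ends up as `sub/b.txt` under the destination, while an archive with several top-level items is extracted unchanged.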