使用 python ZipFile 从 zip 中提取文件而不保留结构?

发布于 2024-10-16 10:26:14 字数 781 浏览 6 评论 0原文

我尝试从 .zip 中提取一个文件夹中包含子文件夹的所有文件。我希望将子文件夹中的所有文件仅提取到一个文件夹中,而不保留原始结构。目前,我提取所有文件,将文件移动到一个文件夹,然后删除以前的子文件夹。相同名称的文件将被覆盖。

是否可以在写入文件之前执行此操作?

例如,这是一个结构:

my_zip/file1.txt
my_zip/dir1/file2.txt
my_zip/dir1/dir2/file3.txt
my_zip/dir3/file4.txt

最后我希望:

my_dir/file1.txt
my_dir/file2.txt
my_dir/file3.txt
my_dir/file4.txt

我可以在这段代码中添加什么?

import zipfile
my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    zip_file.extract(files, my_dir)
zip_file.close()

如果我从 zip_file.namelist() 重命名文件路径,则会出现以下错误:

KeyError: "There is no item named 'file2.txt' in the archive"

I try to extract all files from .zip containing subfolders in one folder. I want all the files from subfolders extract in only one folder without keeping the original structure. At the moment, I extract all, move the files to a folder, then remove previous subfolders. The files with same names are overwrited.

Is it possible to do it before writing files?

Here is a structure for example:

my_zip/file1.txt
my_zip/dir1/file2.txt
my_zip/dir1/dir2/file3.txt
my_zip/dir3/file4.txt

At the end I whish this:

my_dir/file1.txt
my_dir/file2.txt
my_dir/file3.txt
my_dir/file4.txt

What can I add to this code ?

import zipfile
my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    zip_file.extract(files, my_dir)
zip_file.close()

if I rename files path from zip_file.namelist(), I have this error:

KeyError: "There is no item named 'file2.txt' in the archive"

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

╄→承喏 2024-10-23 10:26:14

这将打开 zip 存档成员的文件句柄,提取文件名并将其复制到目标文件(这就是 ZipFile.extract 的工作原理,无需处理子目录)。

import os
import shutil
import zipfile

my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

with zipfile.ZipFile(my_zip) as zip_file:
    for member in zip_file.namelist():
        filename = os.path.basename(member)
        # skip directories
        if not filename:
            continue
    
        # copy file (taken from zipfile's extract)
        source = zip_file.open(member)
        target = open(os.path.join(my_dir, filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)

This opens file handles of members of the zip archive, extracts the filename and copies it to a target file (that's how ZipFile.extract works, without taking care of subdirectories).

import os
import shutil
import zipfile

my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

with zipfile.ZipFile(my_zip) as zip_file:
    for member in zip_file.namelist():
        filename = os.path.basename(member)
        # skip directories
        if not filename:
            continue
    
        # copy file (taken from zipfile's extract)
        source = zip_file.open(member)
        target = open(os.path.join(my_dir, filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)
叹梦 2024-10-23 10:26:14

可以迭代ZipFile.infolist()。在返回的 ZipInfo 对象上,您可以操作文件名来删除目录部分,最后将其提取到指定目录。

import zipfile
import os

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

with zipfile.ZipFile(my_zip) as zip:
    for zip_info in zip.infolist():
        if zip_info.is_dir():
            continue
        zip_info.filename = os.path.basename(zip_info.filename)
        zip.extract(zip_info, my_dir)

It is possible to iterate over the ZipFile.infolist(). On the returned ZipInfo objects you can then manipulate the filename to remove the directory part and finally extract it to a specified directory.

import zipfile
import os

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

with zipfile.ZipFile(my_zip) as zip:
    for zip_info in zip.infolist():
        if zip_info.is_dir():
            continue
        zip_info.filename = os.path.basename(zip_info.filename)
        zip.extract(zip_info, my_dir)
他是夢罘是命 2024-10-23 10:26:14

只需提取内存中的字节,计算文件名,然后自己将其写入其中,
而不是让库来做 - 大多数情况下,只需使用“read()”而不是“extract()”方法:

Python 3.6+ update(2020) - 与原始答案相同的代码,但使用pathlib.Path,它可以简化文件路径操作和其他操作(如“write_bytes”)

from pathlib import Path
import zipfile
import os

my_dir = Path("D:\\Download\\")
my_zip = my_dir / "my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    data = zip_file.read(files, my_dir)
    myfile_path = my_dir / Path(files.filename).name
    myfile_path.write_bytes(data)
zip_file.close()

答案中的原始代码,无需pathlib:

import zipfile
import os

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    data = zip_file.read(files, my_dir)
    # I am almost shure zip represents directory separator
    # char as "/" regardless of OS, but I  don't have DOS or Windos here to test it
    myfile_path = os.path.join(my_dir, files.split("/")[-1])
    myfile = open(myfile_path, "wb")
    myfile.write(data)
    myfile.close()
zip_file.close()

Just extract to bytes in memory,compute the filename, and write it there yourself,
instead of letting the library do it - -mostly, just use the "read()" instead of "extract()" method:

Python 3.6+ update(2020) - the same code from the original answer, but using pathlib.Path, which ease file-path manipulation and other operations (like "write_bytes")

from pathlib import Path
import zipfile
import os

my_dir = Path("D:\\Download\\")
my_zip = my_dir / "my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    data = zip_file.read(files, my_dir)
    myfile_path = my_dir / Path(files.filename).name
    myfile_path.write_bytes(data)
zip_file.close()

Original code in answer without pathlib:

import zipfile
import os

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    data = zip_file.read(files, my_dir)
    # I am almost shure zip represents directory separator
    # char as "/" regardless of OS, but I  don't have DOS or Windos here to test it
    myfile_path = os.path.join(my_dir, files.split("/")[-1])
    myfile = open(myfile_path, "wb")
    myfile.write(data)
    myfile.close()
zip_file.close()
友谊不毕业 2024-10-23 10:26:14

Gerhard Götz 的解决方案类似的概念,但适用于提取单个文件而不是整个 zip:

with ZipFile(zipPath, 'r') as zipObj:
    zipInfo = zipObj.getinfo(path_in_zip))
    zipInfo.filename = os.path.basename(destination)
    zipObj.extract(zipInfo, os.path.dirname(os.path.realpath(destination)))

A similar concept to the solution of Gerhard Götz, but adapted for extracting single files instead of the entire zip:

with ZipFile(zipPath, 'r') as zipObj:
    zipInfo = zipObj.getinfo(path_in_zip))
    zipInfo.filename = os.path.basename(destination)
    zipObj.extract(zipInfo, os.path.dirname(os.path.realpath(destination)))
南薇 2024-10-23 10:26:14

如果您遇到 badZipFile 错误。您可以使用 7zip 子进程解压缩存档。假设您已经安装了 7zip,然后使用以下代码。

import subprocess
my_dir = destFolder #destination folder
my_zip = destFolder + "/" + filename.zip #file you want to extract
ziploc = "C:/Program Files/7-Zip/7z.exe" #location where 7zip is installed
cmd = [ziploc, 'e',my_zip ,'-o'+ my_dir ,'*.txt' ,'-r' ] 
#extracting only txt files and from all subdirectories
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)

In case you are getting badZipFile error. you can unzip the archive using 7zip sub process. assuming you have installed the 7zip then use the following code.

import subprocess
my_dir = destFolder #destination folder
my_zip = destFolder + "/" + filename.zip #file you want to extract
ziploc = "C:/Program Files/7-Zip/7z.exe" #location where 7zip is installed
cmd = [ziploc, 'e',my_zip ,'-o'+ my_dir ,'*.txt' ,'-r' ] 
#extracting only txt files and from all subdirectories
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文