Extract a ZipFile with Python and display the progress as a percentage?

Posted 2024-10-05 08:32:49

I know how to extract a zip archive using Python, but how do I display the progress of that extraction as a percentage?

兔姬 2024-10-12 08:32:49

I suggest using tqdm. You can install it with pip:

pip install tqdm

Then you can use it directly:

>>> import zipfile
>>> from tqdm import tqdm
>>>
>>> # some_source and target_path are placeholders for your archive and output directory
>>> with zipfile.ZipFile(some_source) as zf:
...     for member in tqdm(zf.infolist(), desc='Extracting '):
...         try:
...             zf.extract(member, target_path)
...         except zipfile.BadZipFile:
...             pass  # skip members that fail to extract

This produces output like:

Extracting : 100%|██████████| 60.0k/60.0k [14:56<00:00, 66.9File/s]
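
The bar above advances once per archive member, so every file counts the same regardless of its size. If you want a plain percentage without the extra dependency, one possible variation is to weight each member by its uncompressed size. This is only a rough sketch: extract_with_percent is a made-up helper name, not part of zipfile or tqdm.

```python
import zipfile

def extract_with_percent(zip_path, target_path):
    """Extract every member and yield (name, percent), weighted by uncompressed size.

    Hypothetical helper for illustration; not part of zipfile or tqdm.
    """
    with zipfile.ZipFile(zip_path) as zf:
        members = zf.infolist()
        total = sum(m.file_size for m in members)
        done = 0
        for member in members:
            zf.extract(member, target_path)
            done += member.file_size
            # Guard against archives that hold only empty files or directories.
            percent = 100.0 * done / total if total else 100.0
            yield member.filename, percent
```

Because it is a generator, you drive it with a loop such as `for name, percent in extract_with_percent(some_source, target_path): print(percent, name)`. Progress still only moves when a whole member has been written, so a single huge file will sit at one value until it finishes.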

笑着哭最痛 2024-10-12 08:32:49

The extract method doesn't provide a callback for this, so you have to use getinfo to get the uncompressed size, then open the entry, read from it in blocks, write each block to the destination file, and update the percentage as you go. You also have to restore the mtime if you want it. An example:

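import zipfile

# some_source, entry_name, target_name and block_size are placeholders;
# set_percentage and set_attributes_from are callbacks you supply.
with zipfile.ZipFile(some_source) as z:
    entry_info = z.getinfo(entry_name)
    # open the output in binary mode: ZipFile.open() yields bytes
    with z.open(entry_name) as i, open(target_name, 'wb') as o:
        offset = 0
        while True:
            b = i.read(block_size)
            if not b:  # read() returns b'' at end of file
                break
            o.write(b)
            offset += len(b)
            set_percentage(offset / entry_info.file_size * 100.0)
set_attributes_from(entry_info)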
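
The set_attributes_from call in the example is left as a placeholder. As a rough idea of what restoring the mtime could look like, here is a minimal sketch. The name set_mtime_from and the explicit target_name argument are my own assumptions, and since zip timestamps carry no time zone, the sketch treats them as local time.

```python
import os
import time
import zipfile

def set_mtime_from(entry_info, target_name):
    """Set target_name's modification time from a ZipInfo's date_time.

    Hypothetical stand-in for the set_attributes_from placeholder; it takes the
    output path explicitly and restores only the timestamp, not permissions.
    """
    # date_time is (year, month, day, hour, minute, second). Pad it to the
    # 9-field tuple mktime expects; -1 lets mktime work out DST itself.
    timestamp = time.mktime(entry_info.date_time + (0, 0, -1))
    # Assumption: the stored time is local time. Sets both atime and mtime.
    os.utime(target_name, (timestamp, timestamp))
```

Call it only after the output file has been closed, otherwise the final write can bump the mtime again.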

This extracts entry_name to target_name.

Most of this is also done by shutil.copyfileobj, but it doesn't have a callback for progress either.

In the source, ZipFile.extract calls _extract_member, which uses (Python 2 shown; Python 3 calls open() where this uses file()):

source = self.open(member, pwd=pwd)
target = file(targetpath, "wb")
shutil.copyfileobj(source, target)
source.close()
target.close()

Here member has been converted from a name to a ZipInfo object by getinfo(member) if it wasn't already a ZipInfo object.
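
Since shutil.copyfileobj only needs an object with a read method, one way to keep using it and still get progress is to hand it a thin wrapper that reports every chunk it reads. The sketch below is an illustration under that assumption; ProgressReader and copy_with_progress are made-up names, not part of shutil or zipfile.

```python
import shutil

class ProgressReader:
    """Wrap a readable binary stream and report each chunk's size to a callback.

    Hypothetical helper for illustration; not part of shutil or zipfile.
    """

    def __init__(self, stream, callback):
        self._stream = stream
        self._callback = callback

    def read(self, size=-1):
        data = self._stream.read(size)
        if data:  # nothing to report at end of file
            self._callback(len(data))
        return data

def copy_with_progress(source, target, total, report, block_size=64 * 1024):
    """Copy source to target, calling report(percent) after every chunk read."""
    copied = 0

    def on_chunk(n):
        nonlocal copied
        copied += n
        report(100.0 * copied / total)

    # copyfileobj repeatedly calls read(block_size) on the wrapper.
    shutil.copyfileobj(ProgressReader(source, on_chunk), target, block_size)
```

With the names from the example above, that would be used as `copy_with_progress(i, o, entry_info.file_size, set_percentage)`. The callback only fires for non-empty chunks, so an empty entry never divides by zero.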

心安伴我暖 2024-10-12 08:32:49

Sorry, a bit late seeing this. I had a similar problem and needed an equivalent to zipfile.ZipFile.extractall. If you have tqdm>=4.40.0 (which I released over a year ago), then:

from os import fspath
from pathlib import Path
from shutil import copyfileobj
from zipfile import ZipFile
from tqdm.auto import tqdm  # could use from tqdm.gui import tqdm
from tqdm.utils import CallbackIOWrapper

def extractall(fzip, dest, desc="Extracting"):
    """zipfile.ZipFile(fzip).extractall(dest) with progress"""
    dest = Path(dest).expanduser()
    with ZipFile(fzip) as zipf, tqdm(
        desc=desc, unit="B", unit_scale=True, unit_divisor=1024,
        total=sum(getattr(i, "file_size", 0) for i in zipf.infolist()),
    ) as pbar:
        for i in zipf.infolist():
            if not getattr(i, "file_size", 0):  # directory or empty file
                zipf.extract(i, fspath(dest))
            else:
                target = dest / i.filename
                # the archive may lack explicit directory entries
                target.parent.mkdir(parents=True, exist_ok=True)
                with zipf.open(i) as fi, open(fspath(target), "wb") as fo:
                    copyfileobj(CallbackIOWrapper(pbar.update, fi), fo)

一身软味 2024-10-12 08:32:49

For the lazy, below is a self-contained working example based on Dan D's answer. Tested on Python 3.10.6. Not optimized, but works.

In this example, the assumption is that the target "test" directory exists, but you can of course create it in the extract function.

The advantage of Dan's answer over most of the others I've seen on this topic is that it reports progress within each file. Updating only once per extracted file does not achieve the goal when the archive consists of very large files.

import zipfile
import os
from pathlib import Path

def extract(zip_path, target_path):
    block_size = 8192
    with zipfile.ZipFile(zip_path) as z:
        for entry_name in z.namelist():
            print(entry_name)
            if entry_name.endswith('/'):
                continue  # directory entry; parent directories are created below
            entry_info = z.getinfo(entry_name)
            dir_name = os.path.dirname(entry_name)
            Path(target_path, dir_name).mkdir(parents=True, exist_ok=True)
            with z.open(entry_name) as i, open(Path(target_path, entry_name), 'wb') as o:
                offset = 0
                while True:
                    b = i.read(block_size)
                    if not b:  # end of this entry
                        break
                    o.write(b)
                    offset += len(b)
                    # file_size is non-zero here because at least one byte was read,
                    # so empty files no longer raise ZeroDivisionError
                    print(offset / entry_info.file_size * 100.0)

extract("test.zip", "test")
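
The loop above prints a line for every 8 KiB block, which floods the console on large files. One possible refinement, sketched here under my own assumptions, is to report only when the whole-number percentage changes. make_percent_reporter is a made-up helper name, and emit defaults to print purely for illustration.

```python
def make_percent_reporter(total, emit=print):
    """Return update(n): add n bytes and call emit once per new whole percent.

    Hypothetical helper for illustration; total is the expected byte count.
    """
    state = {"done": 0, "last": -1}

    def update(n):
        state["done"] += n
        # Integer arithmetic avoids float rounding; clamp in case total is too small.
        percent = 100 if total == 0 else min(100, state["done"] * 100 // total)
        if percent != state["last"]:
            state["last"] = percent
            emit(percent)

    return update
```

Inside the read loop you would create `update = make_percent_reporter(entry_info.file_size)` once per entry and call `update(len(b))` after each write, in place of the print call.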

愚人国度 2024-10-12 08:32:49

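import zipfile

srcZipFile = 'srcZipFile.zip'
distZipFile = 'distZipFile'  # output directory
with zipfile.ZipFile(srcZipFile) as zf:
    filesList = zf.namelist()
    # count from 1 so the last member reports 100
    for idx, file in enumerate(filesList, start=1):
        zf.extract(file, distZipFile)
        # percentage of members extracted so far
        percent = round((idx / len(filesList)) * 100)
        print(percent)
    # no explicit zf.close(): the with block closes the archive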
