Python tarfile progress output?

Posted 2024-09-18 14:02:22 · word count 228 · views 9 · comments 0

I'm using the following code to extract a tar file:

import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()

However, I'd like to keep tabs on the progress in the form of which files are being extracted at the moment. How can I do this?

EXTRA BONUS POINTS: is it possible to create a percentage of the extraction process as well? I'd like to use that for tkinter to update a progress bar. Thanks!

Comments (7)

很糊涂小朋友 2024-09-25 14:02:22

Both file-progress and global progress:

import io
import os
import tarfile

# Note: written for Python 2, where extractall() reads member data through
# TarFile.fileobject. Python 3's extractall() appears to read from the
# underlying file object directly, so the per-file hook may never fire there.

def get_file_progress_file_object_class(on_progress):
    """Build an ExFileObject subclass that reports per-file read progress."""
    class FileProgressFileObject(tarfile.ExFileObject):
        def read(self, size, *args):
            on_progress(self.name, self.position, self.size)
            return tarfile.ExFileObject.read(self, size, *args)
    return FileProgressFileObject

class ProgressFileObject(io.FileIO):
    """Archive file wrapper that prints how many compressed bytes have been read."""
    def __init__(self, path, *args, **kwargs):
        self._total_size = os.path.getsize(path)
        io.FileIO.__init__(self, path, *args, **kwargs)

    def read(self, size=-1):
        print("Overall progress: %d of %d" % (self.tell(), self._total_size))
        return io.FileIO.read(self, size)

def on_progress(filename, position, total_size):
    print("%s: %d of %s" % (filename, position, total_size))

# Monkey-patch the class tarfile uses for per-member file objects.
tarfile.TarFile.fileobject = get_file_progress_file_object_class(on_progress)
tar = tarfile.open(fileobj=ProgressFileObject("a.tgz"))
tar.extractall()
tar.close()
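For the overall (whole-archive) part, a variant that calls back instead of printing might look like the sketch below. It assumes Python 3; the names `CallbackFileIO` and `extract_with_overall_progress` are made up for illustration, and the reported position counts compressed bytes read from disk, so it only approximates extraction progress.

```python
import io
import os
import tarfile


class CallbackFileIO(io.FileIO):
    """Hypothetical helper: a FileIO that reports its position after each read()."""

    def __init__(self, path, callback):
        super().__init__(path, "r")
        self._total = os.path.getsize(path)
        self._callback = callback

    def read(self, size=-1):
        data = super().read(size)
        # Position is in bytes of the (possibly compressed) archive file.
        self._callback(self.tell(), self._total)
        return data


def extract_with_overall_progress(archive_path, dest_dir, callback):
    """Extract everything, calling callback(bytes_read, total_bytes) along the way."""
    with CallbackFileIO(archive_path, callback) as raw:
        with tarfile.open(fileobj=raw, mode="r:*") as tar:
            # Use the safer "data" extraction filter where this Python supports it.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            tar.extractall(dest_dir, **kwargs)
```

Calling `extract_with_overall_progress("sample.tar.gz", ".", lambda pos, total: print(pos, "of", total))` would print one line per chunk read. Small archives may be read in a single chunk, so the callback can jump straight to the end.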

丘比特射中我 2024-09-25 14:02:22

You can pass a generator as the members parameter of extractall():

import tarfile

def track_progress(members):
    for member in members:
        # this will be the current file being extracted
        print(member.name)
        yield member

archive_path = "sample.tar.gz"  # placeholder: path to your archive
dest_dir = "."                  # placeholder: extraction directory

with tarfile.open(archive_path, 'r') as tarball:
    tarball.extractall(path=dest_dir, members=track_progress(tarball))

Each member is a TarInfo object; all available methods and attributes are listed at http://docs.python.org/2/library/tarfile.html#tarinfo-objects
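For the percentage asked about in the question, the same generator idea can be extended to weight each member by its size. This is only a sketch: the `on_progress` callback and the function names are assumptions for illustration, and the percentage is based on uncompressed member sizes taken from the TarInfo objects.

```python
import tarfile


def track_progress(tar, on_progress):
    """Yield each member; after it is extracted, report the percentage of bytes done."""
    members = tar.getmembers()
    total = sum(m.size for m in members) or 1  # avoid dividing by zero
    done = 0
    for member in members:
        yield member
        # Execution resumes here once extractall() asks for the next member,
        # i.e. after the previous one has been written out.
        done += member.size
        on_progress(member.name, 100.0 * done / total)


def extract_with_percentage(archive_path, dest_dir, on_progress):
    with tarfile.open(archive_path) as tar:
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest_dir, members=track_progress(tar, on_progress), **kwargs)
```

Weighting by bytes rather than by file count keeps the bar from racing through many tiny files and then stalling on one large one.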

妳是的陽光 2024-09-25 14:02:22

You can wrap the member list in tqdm() to show progress as a count of files extracted:

import tarfile
from tqdm import tqdm

path = "sample.tar.gz"  # placeholder: path to your archive

# open your tar.gz file
with tarfile.open(name=path) as tar:
    # Read the member list once and reuse it
    members = tar.getmembers()

    # Go over each member; tqdm takes the total from len(members)
    for member in tqdm(members):
        # Extract member
        tar.extract(member=member)

南城追梦 2024-09-25 14:02:22

You could use extract instead of extractall - you would be able to print the member names as they are being extracted. To get a list of members, you could use getmembers.

A textual progressbar library can be found here:

Tkinter snippet:
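One way the extract/getmembers approach could feed a tkinter progress bar is sketched below. It is a rough sketch, not a definitive implementation: `extract_in_background` and `run_gui` are hypothetical names, progress is counted per member rather than per byte, and the extraction runs in a worker thread so the GUI stays responsive while the main thread polls a queue.

```python
import queue
import tarfile
import threading


def extract_in_background(archive_path, dest_dir, updates):
    """Extract member by member, putting a 0-100 percentage on `updates` after each."""
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        for index, member in enumerate(members, start=1):
            tar.extract(member, path=dest_dir, **kwargs)
            updates.put(100.0 * index / len(members))
    updates.put(None)  # sentinel: extraction finished


def run_gui(archive_path, dest_dir):
    """Show a progress bar window and close it when extraction completes."""
    import tkinter as tk
    from tkinter import ttk

    root = tk.Tk()
    bar = ttk.Progressbar(root, length=300, maximum=100)
    bar.pack(padx=10, pady=10)

    updates = queue.Queue()
    threading.Thread(
        target=extract_in_background,
        args=(archive_path, dest_dir, updates),
        daemon=True,
    ).start()

    def poll():
        # Drain pending updates on the GUI thread; tkinter widgets should
        # only be touched from the thread running mainloop().
        try:
            while True:
                value = updates.get_nowait()
                if value is None:
                    root.destroy()
                    return
                bar["value"] = value
        except queue.Empty:
            pass
        root.after(50, poll)

    poll()
    root.mainloop()
```

Calling `run_gui("sample.tar.gz", ".")` would open the window and start extracting. The queue keeps all widget updates on the main thread, which is the usual way to combine tkinter with worker threads.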

南街九尾狐 2024-09-25 14:02:22

There's a cool solution here that overrides the tarfile module as a drop-in replacement and lets you specify a callback to update.

https://github.com/thomaspurchas/tarfile-Progress-Reporter/

(Updated based on a comment.)

舟遥客 2024-09-25 14:02:22

To see which file is currently being extracted, the following worked for me:

import tarfile

print("Extracting the contents of sample.tar.gz:")
with tarfile.open("sample.tar.gz") as tar:
    for member_info in tar.getmembers():
        print("- extracting: " + member_info.name)
        tar.extract(member_info)

等往事风中吹 2024-09-25 14:02:22

This is what I use, without monkey patching or needing the number of entries.

import os
import tarfile

def iter_tar_files(path):
    total_bytes = os.stat(path).st_size
    with open(path, "rb") as file_obj, \
            tarfile.open(fileobj=file_obj, mode="r:gz") as tar:
        # Iterate lazily; getmembers() would scan the whole archive up front,
        # so the file position would no longer track extraction.
        for member in tar:
            extracted = tar.extractfile(member)
            if extracted is not None:
                yield member.path, extracted.read()
            # Prints something like: 512 / 1024 = 50.00%
            position = file_obj.tell()
            print(f"{position} / {total_bytes} = {position / total_bytes * 100:.2f}%")