How can I add files to a tarfile with Python without adding the directory hierarchy?

Posted on 2024-08-21 04:44:19

When I call add() on a tarfile object with a file path, the file is added to the tarball together with its directory hierarchy. In other words, when I extract the tar file, the directories from the original hierarchy are recreated.

Is there a way to add a plain file without any directory information, so that untarring the resulting tarball produces a flat list of files?

Comments (7)

一梦等七年七年为一梦 2024-08-28 04:44:19

Using the arcname argument of TarFile.add() is a convenient alternative for getting the archive path you want.

Example: you want to archive the directory repo/a.git/ into a tar.gz file, but you want the tree root inside the archive to be a.git/ rather than repo/a.git/. You can do the following:

import tarfile

# "w|gz" writes a gzip-compressed stream; arcname replaces the stored path
archive = tarfile.open("a.git.tar.gz", "w|gz")
archive.add("repo/a.git", arcname="a.git")
archive.close()
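
To check what arcname does to the stored paths, a minimal sketch along these lines may help. It builds a throwaway tree in a temporary directory, archives it, and lists the member names. The helper name archive_dir_as and the sample file HEAD are made up for illustration, and "w:gz" is used instead of the streaming "w|gz" mode purely as an assumption of this sketch.

```python
import os
import tarfile
import tempfile


def archive_dir_as(src_dir, archive_path, root_name):
    """Archive src_dir so its entries appear under root_name (hypothetical helper)."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(src_dir, arcname=root_name)


# Demo on a throwaway tree: <tmp>/repo/a.git/HEAD
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "repo", "a.git")
    os.makedirs(src)
    with open(os.path.join(src, "HEAD"), "w") as f:
        f.write("ref: refs/heads/main\n")

    out = os.path.join(tmp, "a.git.tar.gz")
    archive_dir_as(src, out, "a.git")

    # Read the archive back and collect the stored member names
    with tarfile.open(out, "r:gz") as archive:
        member_names = sorted(archive.getnames())

print(member_names)
```

The printed names should start with a.git rather than with the temporary path or repo/, which is the effect the answer describes.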

橘虞初梦 2024-08-28 04:44:19

You can use TarFile.addfile(). Its first parameter is a TarInfo object, and the name stored in it can differ from the name of the file you are adding. Note that a TarInfo created by hand has a size of 0, so no file data would be stored; TarFile.gettarinfo() fills in the size and other metadata from the real file, and the file should be opened in binary mode.

This piece of code adds /path/to/filename.txt to the tar file, but it will be extracted as myfilename.txt (tar is an already opened TarFile):

with open("/path/to/filename.txt", "rb") as f:
    tar.addfile(tar.gettarinfo(arcname="myfilename.txt", fileobj=f), f)
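
As a self-contained illustration of addfile(), here is a small sketch that stores in-memory bytes under a chosen name. The helper add_bytes and the sample content are hypothetical. The point it tries to show is that a hand-built TarInfo needs its size set, because addfile() copies exactly that many bytes from the file object.

```python
import io
import tarfile


def add_bytes(tar, name, data):
    """Store data under name in an open TarFile (hypothetical helper)."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)  # a fresh TarInfo has size 0, which would store no data
    tar.addfile(info, io.BytesIO(data))


# Write an archive into memory instead of onto disk
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    add_bytes(tar, "myfilename.txt", b"hello\n")

# Read it back to confirm the stored name and content
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    names = tar.getnames()
    content = tar.extractfile("myfilename.txt").read()

print(names, content)
```

Other metadata such as the modification time and permissions keep the TarInfo defaults in this sketch; set info.mtime or info.mode if they matter.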

∞觅青森が 2024-08-28 04:44:19

Maybe you can use the arcname argument of TarFile.add(name, arcname). It takes an alternative name that the file will have inside the archive.
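
For the flat layout the question asks about, one possible sketch is to pass the base name of each file as arcname. The helper add_flat and the sample paths are assumptions for illustration; the duplicate check is there because two files with the same base name would otherwise end up as two members with identical names.

```python
import os
import tarfile
import tempfile


def add_flat(tar, paths):
    """Add each file under its base name only (hypothetical helper)."""
    seen = set()
    for path in paths:
        name = os.path.basename(path)
        if name in seen:
            raise ValueError(f"duplicate member name: {name}")
        seen.add(name)
        tar.add(path, arcname=name)


# Demo: two files in different subdirectories of a temporary tree
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for sub, fname in [("x", "one.txt"), (os.path.join("y", "z"), "two.txt")]:
        os.makedirs(os.path.join(tmp, sub))
        path = os.path.join(tmp, sub, fname)
        with open(path, "w") as f:
            f.write(fname)
        paths.append(path)

    out = os.path.join(tmp, "flat.tar")
    with tarfile.open(out, "w") as tar:
        add_flat(tar, paths)

    # The stored names carry no directory components
    with tarfile.open(out) as tar:
        flat_names = tar.getnames()

print(flat_names)
```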

始终不够 2024-08-28 04:44:19

Thanks to @diabloneo. Here is a function that creates a tarball from selected items of a directory:

import os
import tarfile

def compress(output_file="archive.tar.gz", output_dir="", root_dir=".", items=()):
    """Compress selected files or directories found under root_dir.

    Keyword arguments
    -----------------
    output_file : str, default "archive.tar.gz"
        name of the archive to create
    output_dir : str, default ""
        absolute path of the directory that receives the archive
    root_dir : str, default "."
        absolute path of the input root directory
    items : iterable of str
        files/directories to add, given relative to root_dir
    """
    with tarfile.open(os.path.join(output_dir, output_file), "w:gz") as tar:
        for item in items:
            # Read from root_dir, but store under the relative name only
            tar.add(os.path.join(root_dir, item), arcname=item)


>>> root_dir = "/abs/pth/to/dir/"
>>> compress(output_file="archive.tar.gz", output_dir=root_dir,
...          root_dir=root_dir, items=["logs", "output"])

孤者何惧 2024-08-28 04:44:19

Here is a code sample that tars the files in folder without adding the folder itself:

    import os
    import tarfile

    with tarfile.open(tar_path, 'w') as tar:
        for filename in os.listdir(folder):
            fpath = os.path.join(folder, filename)
            # arcname drops the folder prefix from the stored name
            tar.add(fpath, arcname=filename)
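
Note that os.listdir() also returns subdirectories, and TarFile.add() adds a directory recursively by default. If only the files directly inside the folder are wanted, a variation like the following sketch could be used. The function name tar_files_only and the sample tree are assumptions for illustration.

```python
import os
import tarfile
import tempfile


def tar_files_only(folder, tar_path):
    """Archive only the files directly inside folder (hypothetical helper)."""
    with tarfile.open(tar_path, "w") as tar:
        with os.scandir(folder) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                # is_file() follows symlinks, so links to files are included too
                if entry.is_file():
                    tar.add(entry.path, arcname=entry.name)


# Demo: folder/a.txt is archived, folder/sub/b.txt is skipped
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "folder")
    os.makedirs(os.path.join(folder, "sub"))
    for rel in ["a.txt", os.path.join("sub", "b.txt")]:
        with open(os.path.join(folder, rel), "w") as f:
            f.write("data")

    tar_path = os.path.join(tmp, "files.tar")
    tar_files_only(folder, tar_path)

    with tarfile.open(tar_path) as tar:
        stored = tar.getnames()

print(stored)
```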

執念 2024-08-28 04:44:19

I was looking for a similar question and got redirected to this page, so I am adding this for fellow googlers.
In my case, I wanted a tar file that contains only relative file names, and it should work recursively. So a directory to archive such as

/home/test/data
/home/test/data/content.txt
/home/test/data/files/file.txt

would look like this inside the archive:

content.txt
files/file.txt

By default, Python's tarfile will add / as an extra entry.

My goal was to remove the leading / entry from the tar file, since it is considered a Zip Slip vulnerability.

When using tar on an archive with this problem, you get a warning:

tar: Removing leading `/' from member names

I'm not sure why the Python tarfile library has no easy way to handle this, but I came up with this code, which does exactly what I want:

import os
import tarfile

def package_tar_recursive_without_root_folder(input_dir: str, output_file: str):
    with tarfile.open(output_file, mode='w:gz') as archive:
        # Only files are added, so empty directories are not stored
        for root, dirs, files in os.walk(input_dir):
            for file in files:
                file_path = os.path.join(root, file)
                # Store each file under its path relative to input_dir
                relative_path = os.path.relpath(file_path, input_dir)
                archive.add(file_path, arcname=relative_path, recursive=False)
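
To verify that an archive really contains only relative names, a check along these lines might be useful. It is a sketch under the assumption that member names use POSIX-style separators; the helper unsafe_member_names and the sample names are made up for illustration, and it is not a complete defense against path traversal on extraction.

```python
import io
import posixpath
import tarfile


def unsafe_member_names(tar):
    """Return member names that are absolute or climb out via '..' (hypothetical helper)."""
    bad = []
    for name in tar.getnames():
        normalized = posixpath.normpath(name)
        if name.startswith("/") or normalized == ".." or normalized.startswith("../"):
            bad.append(name)
    return bad


# Build a small in-memory archive with two good and two suspicious names
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    for name in ["content.txt", "files/file.txt", "../evil.txt", "/abs.txt"]:
        tar.addfile(tarfile.TarInfo(name=name))  # empty members are enough here

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    suspicious = unsafe_member_names(tar)

print(suspicious)
```

Recent Python versions also accept a filter argument in TarFile.extractall() (introduced by PEP 706), which can reject such members at extraction time.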

刘备忘录 2024-08-28 04:44:19

If you want to add a directory name to a tarfile but not its contents, you can do the following:

(1) Create an empty directory called empty.
(2) tf.add("empty", arcname=path_you_want_to_add)

This creates an empty directory entry named path_you_want_to_add in the archive.
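
If creating a placeholder directory on disk is inconvenient, one alternative sketch is to build the directory entry by hand with a TarInfo of type DIRTYPE. This is a swapped-in technique, not the method from the answer; the helper add_empty_dir and the mode 0o755 are assumptions for illustration. Passing recursive=False to TarFile.add() on an existing directory is another option.

```python
import io
import tarfile


def add_empty_dir(tar, name, mode=0o755):
    """Add a directory entry without any contents (hypothetical helper)."""
    info = tarfile.TarInfo(name=name)
    info.type = tarfile.DIRTYPE
    info.mode = mode
    tar.addfile(info)  # no file object: directory entries carry no data


# Write the archive into memory, then read it back to inspect the entry
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    add_empty_dir(tar, "path_you_want_to_add")

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    member = tar.getmember("path_you_want_to_add")
    is_directory = member.isdir()

print(member.name, is_directory)
```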
