How can I programmatically create a tar archive of nested directories and files solely from Python strings, without temporary files?

Posted on 2024-12-23 10:34:59

I want to create a tar archive with a hierarchical directory structure from Python, using strings for the contents of the files. I've read this question, which shows a way of adding strings as files, but not as directories. How can I add directories on the fly to a tar archive without actually creating them on disk?

Something like:

archive.tgz:
    file1.txt
    file2.txt
    dir1/
        file3.txt
        dir2/
            file4.txt

Comments (3)

一个人的夜不怕黑 2024-12-30 10:34:59

Extending the example given in the question linked, you can do it as follows:

# Python 2 only: relies on the StringIO module and the 0755 octal literal
import tarfile
import StringIO
import time

tar = tarfile.TarFile("test.tar", "w")

# File contents held in memory
string = StringIO.StringIO()
string.write("hello")
string.seek(0)

# Directory entry: a header of type DIRTYPE, with no file object
info = tarfile.TarInfo(name='dir')
info.type = tarfile.DIRTYPE
info.mode = 0755
info.mtime = time.time()
tar.addfile(tarinfo=info)

# Regular file inside the directory
info = tarfile.TarInfo(name='dir/foo')
info.size = len(string.buf)
info.mtime = time.time()
tar.addfile(tarinfo=info, fileobj=string)

tar.close()

Be careful with the mode attribute: the default value for a new TarInfo (0644 in octal) does not include execute permission for the owner of the directory, which is needed to change into it and list its contents.
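
To make the point about mode concrete, here is a minimal Python 3 sketch (not the answerer's code) that writes a single directory entry into an in-memory archive and reads it back. The names buf, dinfo and default_mode are only illustrative, and using io.BytesIO as the archive target is an assumption made so that nothing is written to disk.

```python
import io
import tarfile
import time

buf = io.BytesIO()  # the archive lives in memory; no file is created

with tarfile.open(fileobj=buf, mode="w") as tar:
    dinfo = tarfile.TarInfo(name="dir")
    dinfo.type = tarfile.DIRTYPE
    default_mode = dinfo.mode       # whatever TarInfo uses when mode is left alone
    dinfo.mode = 0o755              # rwxr-xr-x: the owner can enter and list the directory
    dinfo.mtime = int(time.time())
    tar.addfile(dinfo)              # directory entries carry no file object

# Read the archive back to check what was stored.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmember("dir")

print(oct(default_mode), member.isdir(), oct(member.mode))
```

If the mode assignment is left out, the directory is stored with the default permissions, and extracting it as a regular user may produce a directory that cannot be entered.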

∞琼窗梦回ˉ 2024-12-30 10:34:59

A slight modification to the helpful accepted answer so that it works with Python 3 as well as Python 2 (and matches the OP's example a bit more closely):

from io import BytesIO
import tarfile
import time

# create and open empty tar file
tar = tarfile.open("test.tgz", "w:gz")

# Add a file
file1_contents = BytesIO("hello 1".encode())
finfo1 = tarfile.TarInfo(name='file1.txt')
finfo1.size = len(file1_contents.getvalue())
finfo1.mtime = time.time()
tar.addfile(tarinfo=finfo1, fileobj=file1_contents)

# create directory in the tar file
dinfo = tarfile.TarInfo(name='dir')
dinfo.type = tarfile.DIRTYPE
dinfo.mode = 0o755
dinfo.mtime = time.time()
tar.addfile(tarinfo=dinfo)

# add a file to the new directory in the tar file
file2_contents = BytesIO("hello 2".encode())
finfo2 = tarfile.TarInfo(name='dir/file2.txt')
finfo2.size = len(file2_contents.getvalue())
finfo2.mtime = time.time()
tar.addfile(tarinfo=finfo2, fileobj=file2_contents)

tar.close()

In particular, I updated the octal syntax following PEP 3127 -- Integer Literal Support and Syntax, switched to BytesIO from the io module, used getvalue() instead of buf, and used tarfile.open instead of TarFile to show gzip-compressed output as in the example. (Context manager usage (with ... as tar:) would also work in both Python 2 and Python 3, but cut and paste didn't work with my Python 2 REPL, so I didn't switch it.) Tested on Python 2.7.15+ and Python 3.7.3.
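
For reference, the context manager form mentioned in the parenthetical could look roughly like the sketch below, which rebuilds the exact layout from the question. This is an illustration under stated assumptions rather than the answerer's code: add_dir and add_text_file are hypothetical helper names, and the archive is written to an io.BytesIO buffer instead of test.tgz so that no file is touched at all.

```python
import io
import tarfile
import time


def add_dir(tar, name, mode=0o755):
    """Add a directory entry without touching the filesystem."""
    info = tarfile.TarInfo(name=name)
    info.type = tarfile.DIRTYPE
    info.mode = mode
    info.mtime = int(time.time())
    tar.addfile(info)


def add_text_file(tar, name, text):
    """Add a regular file whose contents come from a str."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)           # size must match the bytes actually supplied
    info.mtime = int(time.time())
    tar.addfile(info, io.BytesIO(data))


archive = io.BytesIO()  # pass a filename to tarfile.open instead to write to disk
with tarfile.open(fileobj=archive, mode="w:gz") as tar:
    add_text_file(tar, "file1.txt", "hello 1")
    add_text_file(tar, "file2.txt", "hello 2")
    add_dir(tar, "dir1")
    add_text_file(tar, "dir1/file3.txt", "hello 3")
    add_dir(tar, "dir1/dir2")
    add_text_file(tar, "dir1/dir2/file4.txt", "hello 4")

# Reopen the in-memory archive to confirm the structure.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r:gz") as tar:
    names = tar.getnames()
    file4 = tar.extractfile("dir1/dir2/file4.txt").read().decode("utf-8")

print(names)
print(file4)
```

The with block closes the archive even if one of the addfile calls raises, which the explicit tar.close() version does not guarantee.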

猫九 2024-12-30 10:34:59

Looking at the tar file format, it seems doable. The files that go in each subdirectory get their relative pathname (e.g. dir1/file3.txt) as their name.

The only trick is that you should define each directory before the files that go into it, rather than relying on the extractor to create the missing parent directories on the fly. There is a special flag you can use to identify a tarfile entry as a directory, but for legacy purposes, tar also accepts file entries whose names end with / as representing directories, so you should be able to just add dir1/ as a file from a zero-length string using the same technique.
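
The ordering rule described above can be automated. The following Python 3 sketch is one possible approach, not something taken from the answers: build_tar is a hypothetical helper that takes a mapping of relative paths to strings and emits every parent directory exactly once, before the first file that needs it. It uses the explicit DIRTYPE flag rather than the trailing-slash convention, and it leaves mtime at the TarInfo default for brevity.

```python
import io
import tarfile


def build_tar(files):
    """Return tar bytes for a {path: text} mapping, emitting parents first."""
    buf = io.BytesIO()
    seen_dirs = set()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(files):
            # Walk the ancestors of this path from the top down.
            parts = path.split("/")[:-1]
            for depth in range(1, len(parts) + 1):
                dirname = "/".join(parts[:depth])
                if dirname not in seen_dirs:
                    seen_dirs.add(dirname)
                    dinfo = tarfile.TarInfo(name=dirname)
                    dinfo.type = tarfile.DIRTYPE
                    dinfo.mode = 0o755
                    tar.addfile(dinfo)
            # Now the file itself, with its full relative path as the name.
            data = files[path].encode("utf-8")
            finfo = tarfile.TarInfo(name=path)
            finfo.size = len(data)
            tar.addfile(finfo, io.BytesIO(data))
    return buf.getvalue()


tar_bytes = build_tar({
    "file1.txt": "one",
    "dir1/file3.txt": "three",
    "dir1/dir2/file4.txt": "four",
})

with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
    listing = [(m.name, m.isdir()) for m in tar.getmembers()]

for name, is_dir in listing:
    print(name + ("/" if is_dir else ""))
```

Sorting the paths is only a convenience that keeps the output deterministic; the seen_dirs set is what guarantees each directory entry appears once and ahead of its contents.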
