How do I modify a file in a gzipped tar file?
I want to write a (preferably Python) script to modify the content of one file in a gzipped tar file. The script must run on FreeBSD 6+.
Basically, I need to:
- open the tar file
- if the tar file has _MY_FILE_ in it:
- if _MY_FILE_ has a line matching /RE/ in it:
- insert LINE after the matching line
- rewrite the content into the tar file, preserving all metadata except the file size
I'll be repeating this for a lot of files.
Python's tarfile module doesn't seem to be able to open tar files for read/write access when they're compressed, which makes a certain amount of sense. However, I can't find a way to copy the tar file with modifications, either.
Is there an easy way to do this?
Comments (3)
I think David Phillips already answered quite well, but here's some example code on top:
This code does a copy of the input_tar_file to the output_tar_file. If you want to modify things, start at the print() call. There, you can inspect the input, discard it, or modify it as you desire.
Things to keep in mind:
- The file size is recorded in two places: one is info.size, the other is implicitly given by the length of the file stream.
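A minimal sketch of the kind of pass-through copy described above, assuming Python 3. The function name copy_tar is hypothetical and this is not the answer's original listing; input_tar_file, output_tar_file, info and file follow the names used in the text.

```python
import tarfile


def copy_tar(input_tar_file, output_tar_file):
    """Copy every member of a gzipped tar archive into a new gzipped archive."""
    with tarfile.open(input_tar_file, "r:gz") as src, \
         tarfile.open(output_tar_file, "w:gz") as dst:
        for info in src:
            # Only regular files carry data; directories, links and devices
            # are written as header-only members.
            file = src.extractfile(info) if info.isreg() else None
            print(info.name)  # inspect, discard, or modify the member here
            # addfile() reads exactly info.size bytes from file, so the two
            # sizes must agree if the content is changed.
            dst.addfile(info, file)
```

Because addfile() trusts info.size, any change to the content has to be paired with an update of info.size before the member is written.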
Don't think of a tar file as a database that you can read/write -- it's not. A tar file is a concatenation of files. To modify a file in the middle, you need to rewrite the rest of the archive. (For files of a certain size, you might be able to exploit the block padding.)
What you want to do is process the tarball file by file, copying files (with modifications) into a new tarball. The Python tarfile module should make this easy to do. You should be able to retain the attributes by copying them from the old TarInfo object to the new one.
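The file-by-file approach could be sketched like this, assuming Python 3. The function insert_after_match and its parameter names are illustrative, not part of tarfile; pattern and new_line are expected to be bytes, and each regular file is read fully into memory, which may not suit very large members.

```python
import io
import re
import tarfile


def insert_after_match(src_path, dst_path, member_name, pattern, new_line):
    """Copy src_path to dst_path, inserting new_line after every line of
    member_name that matches pattern. Returns True if anything was inserted."""
    regex = re.compile(pattern)
    changed = False
    with tarfile.open(src_path, "r:gz") as src, \
         tarfile.open(dst_path, "w:gz") as dst:
        for info in src:
            if not info.isreg():
                dst.addfile(info)  # header-only member: directory, link, device
                continue
            data = src.extractfile(info).read()
            if info.name == member_name:
                out = []
                for line in data.splitlines(keepends=True):
                    out.append(line)
                    if regex.search(line):
                        if not line.endswith(b"\n"):
                            out.append(b"\n")  # matched line was the unterminated last line
                        out.append(new_line)
                        changed = True
                data = b"".join(out)
                info.size = len(data)  # every other TarInfo field is reused as-is
            dst.addfile(info, io.BytesIO(data))
    return changed
```

Reusing the TarInfo object read from the old archive keeps owner, mode and mtime intact, so only the size differs in the new tarball. A call might look like insert_after_match("in.tar.gz", "out.tar.gz", "MY_FILE", rb"RE", b"LINE\n"), with the output then renamed over the input if desired.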
I don't see an easy way to remove a single file. You can easily extract one or all, then add any files needed.
I think that the only way is:
When you re-create the archive, be sure to set the same format that the original used:
- tarfile.USTAR_FORMAT: POSIX.1-1988 (ustar) format.
- tarfile.GNU_FORMAT: GNU tar format.
- tarfile.PAX_FORMAT: POSIX.1-2001 (pax) format.
- tarfile.DEFAULT_FORMAT: the default format for creating archives.
http://docs.python.org/library/tarfile.html
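As a small illustration of choosing the format on re-creation, tarfile.open() accepts a format keyword argument when writing. The sketch below assumes Python 3 and that you already know which format the original archive uses; the helper name open_for_rewrite is hypothetical.

```python
import tarfile


def open_for_rewrite(output_tar_file, fmt=tarfile.GNU_FORMAT):
    """Open a new gzipped archive for writing in an explicit tar format.

    fmt should be one of tarfile.USTAR_FORMAT, tarfile.GNU_FORMAT or
    tarfile.PAX_FORMAT; the GNU default here is only an example choice.
    """
    return tarfile.open(output_tar_file, "w:gz", format=fmt)
```

The returned object is an ordinary TarFile, so members can be added with addfile() exactly as in any other write-mode archive.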