How can files be transparently compressed/decompressed as a program writes/reads them?

Posted on 2024-07-17 08:09:51

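I have a program that reads and writes very large text files. Because of their format (they are ASCII representations of binary data), these files compress very well. For example, some of them are over 10GB in size, but gzip achieves 95% compression on them.

I can't modify the program, but disk space is precious, so I need to set up a way for it to read and write these files while they are transparently compressed and decompressed.

The program can only read and write files, so as far as I understand, I need to set up a named pipe for both input and output. Some people have suggested a compressed file system instead, which seems like it would work too. How do I make either of these work?

Technical information: I'm on a modern Linux. The program uses separate input and output files. It reads the input file sequentially, but twice. It writes the output file sequentially.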

爱你是孤单的心事 2024-07-24 08:09:51

Take a look at zlibc: http://zlibc.linux.lu/.

Also, if FUSE is an option (i.e. the kernel is not too old), consider compFUSEd: http://www.biggerbytes.be/

ぇ气 2024-07-24 08:09:51

Named pipes won't give you full-duplex operation, so things get a bit more complicated if you can only give the program a single filename.

Do you know whether your application needs to seek within the file?

Can your application work with stdin and stdout?

One option might be to create a mini compressed file system that contains only a directory with your files.

Since you have separate input and output files, you can do the following:

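mkfifo readfifo
mkfifo writefifo
# decompress the input into the read FIFO in the background
zcat yourinputfile > readfifo &
# compress whatever the program writes to the write FIFO
gzip < writefifo > youroutputfile &

Then launch your program, giving it readfifo as the input file and writefifo as the output file.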
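To make the FIFO plumbing concrete, here is a minimal end-to-end sketch of the steps above. It is only an illustration under some assumptions: `my_program` is a hypothetical stand-in for the real program (it upper-cases its input and reads it only once), the sample data and the `workdir`, `input.gz` and `output.gz` names are invented for the demo, and `gzip -dc` is used in place of `zcat` since the two are equivalent for gzip files.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for the unmodifiable program: it takes an input
# path and an output path, reads the input once, and writes it upper-cased.
my_program() {
    tr 'a-z' 'A-Z' < "$1" > "$2"
}

workdir=$(mktemp -d)

# Small compressed sample input (stands in for a multi-gigabyte file).
printf 'deadbeef\ncafef00d\n' | gzip -c > "$workdir/input.gz"

mkfifo "$workdir/readfifo" "$workdir/writefifo"

# Decompress into the read FIFO. Opening it for writing blocks until
# the program opens the same FIFO for reading.
gzip -dc "$workdir/input.gz" > "$workdir/readfifo" &

# Compress whatever the program writes into the write FIFO.
gzip -c < "$workdir/writefifo" > "$workdir/output.gz" &

# The program is simply handed the two FIFO paths as its "files".
my_program "$workdir/readfifo" "$workdir/writefifo"

# Wait for both background gzip processes so output.gz is complete.
wait

# Show the decompressed result. Cleanup of $workdir is left out for brevity.
gzip -dc "$workdir/output.gz"
```

The `wait` matters: without it the script could go on to use `output.gz` before the compressing `gzip` has flushed its last block. The sketch covers only a single sequential pass over the input.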

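Now, you will probably run into trouble with the input being read twice. A FIFO delivers its data only once: when zcat finishes and closes its end, your program sees end-of-file, and it can neither seek back to the start nor reopen the FIFO for a second pass without blocking until a new writer shows up. (SIGPIPE goes the other way: zcat would receive it if your program closed the FIFO while zcat was still writing.)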

The proper solution is probably to use a compressed file system like CompFUSE, because then you don't have to worry about unsupported operations like seek.
