How do I do compressed differential archival in .NET?
I have byte arrays that can be a few dozen megabytes in size. Such large arrays are not happy creatures, especially when you have many of them. So I would like to compress them to make them easier to deal with. They compress well, generally at a 3:1 ratio with DotNetZip set to BestSpeed.
The data in the arrays can be nearly identical. With that in mind, I was hoping to find some way to programmatically compress the arrays differentially, much like version control or backup software does. This way, if I have three 30 MB arrays that differ only in sparse places, my zip file would be closer to 10 MB instead of 30.
I have tried many queries on Google and Stack Overflow, with terms like compressed, archival, backup, diff, differential... None of them are turning up anything useful. What should I be looking for?
Comments (1)
You may want to look into how the rsync protocol works on Unix. It essentially computes the differences between two files and encodes them as a compressed delta, from which the new file can be rebuilt given the old one. You may be able to adapt that to what you're trying to do.
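
To make the delta idea concrete, here is a minimal sketch in Python (not .NET, and not rsync's actual algorithm: rsync uses rolling checksums so it can also match blocks that have shifted position). It assumes the arrays line up position by position and differ only in sparse places, as the question describes. The names make_delta and apply_delta, the block size, and the delta layout are all hypothetical choices made for illustration.

```python
import struct
import zlib

BLOCK_SIZE = 4096  # assumed block size; tune for your data


def make_delta(base: bytes, target: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Build a compressed delta holding only the blocks of target that differ from base."""
    # Header: total length of the target array.
    parts = [struct.pack("<Q", len(target))]
    for offset in range(0, len(target), block_size):
        block = target[offset:offset + block_size]
        if block != base[offset:offset + block_size]:
            # Record: offset, block length, then the literal bytes.
            parts.append(struct.pack("<QI", offset, len(block)))
            parts.append(block)
    return zlib.compress(b"".join(parts), 1)  # level 1 favours speed


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild the target array from base plus a delta made by make_delta."""
    raw = zlib.decompress(delta)
    (target_len,) = struct.unpack_from("<Q", raw, 0)
    # Start from base, truncated or zero-padded to the target length.
    out = bytearray(base[:target_len].ljust(target_len, b"\0"))
    pos = 8
    while pos < len(raw):
        offset, length = struct.unpack_from("<QI", raw, pos)
        pos += 12  # size of the "<QI" record header
        out[offset:offset + length] = raw[pos:pos + length]
        pos += length
    return bytes(out)
```

With this scheme you would store the first array compressed in full and each later array as a delta against it. An array that differs from the base in only a few blocks then costs roughly the size of those blocks rather than a third of its full size. If data can be inserted or removed so that later bytes shift, aligned block comparison stops helping, and that is the case rsync's rolling checksum is designed for.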