zlib decompression fails

Posted 2024-08-04 07:27:49

I'm writing an application that needs to uncompress data compressed by another application (which is outside my control: I cannot make changes to its source code). The producer application uses zlib to compress data using the z_stream mechanism. It uses Z_FULL_FLUSH frequently (probably too frequently, in my opinion, but that's another matter). This third-party application is also able to uncompress its own data, so I'm pretty confident that the data itself is correct.

In my test, I'm using this third party app to compress the following simple text file (in hex):

48 65 6c 6c 6f 20 57 6f 72 6c 64 21 0d 0a

The compressed bytes I receive from the app look like this (again, in hex):

78 9c f2 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 00 00 ff ff

If I compress the same data myself, I get very similar results:

78 9c f3 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 24 e9 04 55

There are two differences that I can see:

First, the third byte is F2 rather than F3, so the deflate "final block" bit has not been set. I assume this is because the stream interface never knows when the incoming data will end, so it never sets that bit?
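To make the "final block" observation concrete, here is a minimal Python sketch (Python is my choice for illustration, not something the post uses). It assumes the byte right after the two-byte zlib header is the first byte of the first deflate block, so its lowest bit is BFINAL and the next two bits are BTYPE, as laid out in RFC 1951. The helper name `block_header` is hypothetical.

```python
# First deflate byte of each stream from the post (the byte after "78 9c").
EXTERNAL_FIRST = 0xF2  # stream written by the third-party app
OWN_FIRST = 0xF3       # stream written by a one-shot compress


def block_header(byte):
    """Return (BFINAL, BTYPE) from the low three bits of a block's first byte.

    Only valid when the block starts on a byte boundary, which is true
    for the first block of a stream.
    """
    bfinal = byte & 0x01          # 1 means "this is the last block"
    btype = (byte >> 1) & 0x03    # 0 stored, 1 fixed Huffman, 2 dynamic Huffman
    return bfinal, btype


print(block_header(EXTERNAL_FIRST))  # (0, 1)
print(block_header(OWN_FIRST))       # (1, 1)
```

Both streams use a fixed-Huffman block; the only difference in that byte is the BFINAL bit.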

Second, the last four bytes of the external data are 00 00 FF FF, whereas in my test data they are 24 E9 04 55. Searching around, I found on this page

http://www.bolet.org/~pornin/deflate-flush.html

...that this is a signature of a sync or full flush.
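The flush signature can be reproduced with a short sketch. This assumes Python's `zlib` module, which wraps the same library, stands in for the producer application (whose actual code is not shown in the post). It compresses the test text, issues a `Z_FULL_FLUSH` without finishing the stream, and checks for the `00 00 ff ff` marker.

```python
import zlib

data = b"Hello World!\r\n"

comp = zlib.compressobj()
# Compress and flush, but do NOT finish the stream yet.
flushed = comp.compress(data) + comp.flush(zlib.Z_FULL_FLUSH)

# A sync or full flush ends with an empty stored block: LEN=0000, NLEN=ffff.
has_marker = flushed.endswith(b"\x00\x00\xff\xff")
print(flushed.hex(" "))
print(has_marker)  # True

# Finishing the stream appends a final block and the Adler-32 trailer,
# after which a one-shot decompress accepts it.
finished = flushed + comp.flush(zlib.Z_FINISH)
print(zlib.decompress(finished) == data)  # True
```

On a stock zlib build the flushed bytes are expected to match the external stream from the post, but the exact bytes can vary between zlib builds, so the sketch only checks the marker.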

When I decompress my own data using the uncompress() function, everything works perfectly. However, when I try to decompress the external data, the uncompress() call fails with a return code of Z_DATA_ERROR, indicating corrupt data.

I have a few questions:

  1. Should I be able to use the zlib "uncompress" function to uncompress data that has been compressed with the z_stream method?

  2. In the example above, what is the significance of the last four bytes? Given that both the externally compressed data stream and my own test data stream are the same length, what do my last four bytes represent?

Cheers

Comments (2)

把时间冻结 2024-08-11 07:27:49

Thanks to the zlib authors, I have found the answer. The third party app is generating zlib streams that are not finished correctly:

78 9c f2 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 00 00 ff ff

That is a partial zlib stream, consisting of a zlib header and a partial deflate stream. There are two blocks, neither of which is a last block. The second block is an empty stored block, used as a marker when flushing. A zlib decoder would correctly decode what's there, and then continue to look for data after those bytes.

78 9c f3 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 24 e9 04 55

That is a complete zlib stream, consisting of a zlib header, a single block marked as the last block, and a zlib trailer. The trailer is the Adler-32 checksum of the uncompressed data.
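The trailer claim is easy to check. A small sketch, assuming Python's `zlib.adler32` and the bytes quoted in the question: the last four bytes of the complete stream, read as a big-endian integer, should equal the Adler-32 of the original text.

```python
import zlib

original = b"Hello World!\r\n"
complete = bytes.fromhex(
    "78 9c f3 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 24 e9 04 55"
)

# The zlib trailer stores the Adler-32 of the uncompressed data, big-endian.
trailer = int.from_bytes(complete[-4:], "big")
checksum = zlib.adler32(original)

print(hex(trailer))         # 0x24e90455
print(trailer == checksum)  # True
```

This answers the second question: `24 e9 04 55` is the checksum, and it only coincidentally has the same length as the `00 00 ff ff` flush marker.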

So my decompression is failing, probably because the Adler-32 checksum trailer is missing, or because the decompression code keeps looking for more data that does not exist.
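One way to read such an unfinished stream is to use a streaming decoder instead of the one-shot call. The sketch below uses Python's `zlib.decompressobj`; in C the rough equivalent would be an `inflateInit` / `inflate` / `inflateEnd` loop rather than `uncompress()`. It is only an illustration under that assumption, not the fix the original poster necessarily applied.

```python
import zlib

partial = bytes.fromhex(
    "78 9c f2 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 00 00 ff ff"
)

# The one-shot call insists on a finished stream and rejects this input.
try:
    zlib.decompress(partial)
    one_shot_ok = True
except zlib.error:
    one_shot_ok = False

# A streaming decoder returns whatever has been decoded so far.
decoder = zlib.decompressobj()
text = decoder.decompress(partial)

print(one_shot_ok)  # False
print(text)         # b'Hello World!\r\n'
print(decoder.eof)  # False: no final block or trailer was seen
```

Because the stream never ends, there is no checksum to verify, so the caller has to decide by other means (for example, message framing) when all the data has arrived.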

千鲤 2024-08-11 07:27:49

The solution is here:
http://technology.amis.nl/2010/03/13/utl_compress-gzip-and-zlib/

It provides decompression and compression functions for a compressed database blob (or stream) that starts with the 78 9C signature.
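For context on the 78 9C signature, here is a short sketch that decodes the two header bytes according to the zlib format (RFC 1950). The variable names are mine, chosen for illustration.

```python
cmf, flg = 0x78, 0x9C

method = cmf & 0x0F                  # 8 means deflate
window_size = 1 << ((cmf >> 4) + 8)  # CINFO 7 gives a 32 KiB window
header_ok = (cmf * 256 + flg) % 31 == 0  # FCHECK makes the pair a multiple of 31
has_dict = (flg >> 5) & 0x01         # FDICT: preset dictionary flag
level_hint = flg >> 6                # FLEVEL: 2 is the default compression level

print(method, window_size, header_ok, has_dict, level_hint)
# 8 32768 True 0 2
```

The header only identifies the data as a zlib stream; it says nothing about whether the stream was finished, which is the actual problem in the question.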
