How does gzcompress work?

Posted on 2024-09-08 14:33:14

I'm wondering why I need to cut off the last 4 characters after using gzcompress().

Here is my code:

header("Content-Encoding: gzip");
echo "\x1f\x8b\x08\x00\x00\x00\x00\x00";
$index = $smarty->fetch("design/templates/main.htm") ."\n<!-- Compressed by gzip -->";
$this->content_size = strlen($index);
$this->content_crc = crc32($index);
$index = gzcompress($index, 9);
$index = substr($index, 0, strlen($index) - 4); // Why cut off ??
echo $index;
echo pack('V', $this->content_crc) . pack('V', $this->content_size);

When I don't cut off the last 4 characters, the source ends like this:

[...]
<!-- Compressed by gzip -->N

When I cut them off, it reads:

[...]
<!-- Compressed by gzip -->

I could see the additional N only in Chrome's code inspector (not in Firefox and not in IE's source view). But there seem to be four additional characters at the end of the code.

Can anyone explain why I need to cut off these 4 characters?

昇り龍 2024-09-15 14:33:14

gzcompress implements the ZLIB compressed data format (RFC 1950), which has the following structure:

     0   1
   +---+---+
   |CMF|FLG|   (more-->)
   +---+---+

(if FLG.FDICT set)

     0   1   2   3
   +---+---+---+---+
   |     DICTID    |   (more-->)
   +---+---+---+---+

   +=====================+---+---+---+---+
   |...compressed data...|    ADLER32    |
   +=====================+---+---+---+---+

Here you can see that the last four bytes are an Adler-32 checksum.

In contrast, the GZIP file format is a list of so-called members with the following structure:
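
The layout above can be checked from PHP itself. The following is a minimal sketch, assuming the zlib and hash extensions are available; the sample string and variable names are made up for illustration. It splits the gzcompress() output into its two-byte header, the raw DEFLATE stream, and the four-byte Adler-32 trailer (no DICTID is expected, since gzcompress() does not use a preset dictionary).

```php
<?php
// Illustrative input; any string works.
$data = "Hello, zlib!";

$zlib = gzcompress($data, 9);

// First two bytes: CMF and FLG.
$cmf = ord($zlib[0]);
$flg = ord($zlib[1]);

// The low nibble of CMF is the compression method (8 = DEFLATE).
$method = $cmf & 0x0f;

// RFC 1950 requires (CMF * 256 + FLG) to be a multiple of 31.
$headerOk = ((($cmf << 8) | $flg) % 31) === 0;

// Last four bytes: Adler-32 of the uncompressed data, big-endian.
$trailer = substr($zlib, -4);
$adler   = hash('adler32', $data, true);

// Everything in between is a raw DEFLATE stream.
$raw = substr($zlib, 2, -4);

var_dump($method);                   // compression method
var_dump($headerOk);                 // header check
var_dump($trailer === $adler);       // trailer is the Adler-32 checksum
var_dump(gzinflate($raw) === $data); // the middle part inflates on its own
```

This is why cutting off four bytes removes exactly the checksum and nothing else.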

   +---+---+---+---+---+---+---+---+---+---+
   |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
   +---+---+---+---+---+---+---+---+---+---+

(if FLG.FEXTRA set)

   +---+---+=================================+
   | XLEN  |...XLEN bytes of "extra field"...| (more-->)
   +---+---+=================================+

(if FLG.FNAME set)

   +=========================================+
   |...original file name, zero-terminated...| (more-->)
   +=========================================+

(if FLG.FCOMMENT set)

   +===================================+
   |...file comment, zero-terminated...| (more-->)
   +===================================+

(if FLG.FHCRC set)

   +---+---+
   | CRC16 |
   +---+---+

   +=======================+
   |...compressed blocks...| (more-->)
   +=======================+

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   |     CRC32     |     ISIZE     |
   +---+---+---+---+---+---+---+---+

As you can see, GZIP uses a CRC-32 checksum for the integrity check.
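
For comparison, here is a small sketch of the same inspection for a GZIP member, assuming PHP's gzencode() is used to produce it; the sample string and variable names are illustrative only. It checks the fixed leading bytes and compares the eight-byte trailer with a CRC-32 and length packed in little-endian order.

```php
<?php
// Illustrative input; any string works.
$data = "Hello, gzip!";

$gz = gzencode($data, 9);

// ID1, ID2 and CM are fixed for DEFLATE-compressed GZIP members.
$magic = substr($gz, 0, 3);

// Last eight bytes: CRC32 followed by ISIZE, both little-endian.
$trailer  = substr($gz, -8);
$expected = pack('V', crc32($data)) . pack('V', strlen($data));

var_dump($magic === "\x1f\x8b\x08");
var_dump($trailer === $expected);
```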

So, to analyze your code:

echo "\x1f\x8b\x08\x00\x00\x00\x00\x00"; – puts out the following header fields:
- 0x1f 0x8b – ID1 and ID2, fixed values that identify the data format
- 0x08 – CM, the compression method used; 8 denotes the DEFLATE data compression format (RFC 1951)
- 0x00 – FLG, flags
- 0x00000000 – MTIME, modification time
- the fields XFL (extra flags) and OS (operating system) are not written here; the two-byte ZLIB header (CMF and FLG) at the start of the gzcompress() output ends up in their place
echo $index; – puts out the data compressed in the DEFLATE format. Because gzcompress() appends an Adler-32 checksum that is not part of a GZIP member, the last four bytes have to be cut off first; otherwise they would sit between the compressed blocks and the GZIP trailer
echo pack('V', $this->content_crc) . pack('V', $this->content_size); – puts out the CRC-32 checksum and the size of the uncompressed input data in binary (little-endian); these are the CRC32 and ISIZE fields of the GZIP trailer
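
One way to avoid the substr() workaround is to start from a raw DEFLATE stream. The sketch below assumes gzdeflate() is used for that and writes all ten header bytes explicitly. The helper name build_gzip_member() and the sample page are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: builds a single GZIP member by hand.
function build_gzip_member(string $data, int $level = 9): string
{
    $header = "\x1f\x8b"          // ID1, ID2
            . "\x08"              // CM = DEFLATE
            . "\x00"              // FLG = no optional fields
            . "\x00\x00\x00\x00"  // MTIME = not available
            . "\x00"              // XFL = no extra flags
            . "\xff";             // OS = unknown

    // gzdeflate() emits raw DEFLATE data: no ZLIB header, no Adler-32.
    $body = gzdeflate($data, $level);

    // CRC32 and ISIZE, both little-endian.
    $trailer = pack('V', crc32($data)) . pack('V', strlen($data));

    return $header . $body . $trailer;
}

// Illustrative page content.
$page   = "<html>...</html>\n<!-- Compressed by gzip -->";
$member = build_gzip_member($page);

var_dump(gzdecode($member) === $page);             // hand-built member decodes
var_dump(gzdecode(gzencode($page, 9)) === $page);  // gzencode() does it in one call
```

In practice gzencode() is likely the simpler choice, since it produces the header, the compressed blocks, and the trailer in one call.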