Why does gzip/deflate compression of a small file result in many trailing zeroes?

Posted 2024-09-13 18:39:14

I'm using the following code to compress a small (~4kB) HTML file in C#.

byte[] fileBuffer = ReadFully(inFile, ResponsePacket.maxResponsePayloadLength); // Read the entire requested HTML file into a memory buffer
inFile.Close();                                                                 // Close the requested HTML file

byte[] payload;
using (MemoryStream compMS = new MemoryStream())                                       // Create a new memory stream to hold the compressed HTML data
{
    using (GZipStream gzip = new GZipStream(compMS, CompressionMode.Compress))            // Create a new GZip object pointing to the empty memory stream
    {
        gzip.Write(fileBuffer, 0, fileBuffer.Length);                                   // Compress the file buffer and write it to the empty memory stream
        gzip.Close();                                                                   // Close the GZip object
    }
    payload = compMS.GetBuffer();                                            // Write the compressed file buffer data in the memory stream to a byte buffer
}

The resulting compressed data is about 2k, but about half of it is just zeroes. This is for a very bandwidth sensitive application (which is why I'm bothering to compress 4kB in the first place), so the extra 1kB of zeroes is wasted valuable space. My best guess would be that the compression algorithm is padding out the data to a block boundary. If so, is there any way to override this behavior or change the block size? I get the same results with vanilla .NET GZipStream and zlib's GZipStream, as well as DeflateStream.

Comments (1)

羁绊已千年 2024-09-20 18:39:15

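You are calling the wrong MemoryStream method. GetBuffer() returns the underlying buffer, which is always at least as large as the data written to the stream; the unused tail of that buffer is zero-filled, and those are the trailing zeroes you are seeing. GetBuffer() is very efficient because no copy needs to be made, but the compressor is not padding anything.

What you need here is the ToArray() method, which returns a copy containing only the bytes actually written. Alternatively, keep GetBuffer() and use the stream's Length property to know how many bytes of the buffer are valid.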
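
To make the two options concrete, here is a minimal sketch of both. The class and method names (GzipExample, Compress, CompressNoCopy, Decompress) are made up for illustration and are not from the question; only the MemoryStream and GZipStream calls are real .NET APIs.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, for illustration only.
public static class GzipExample
{
    // Option 1: ToArray() returns a copy holding only the bytes written.
    public static byte[] Compress(byte[] data)
    {
        using (var compMS = new MemoryStream())
        {
            using (var gzip = new GZipStream(compMS, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // Disposing gzip flushes the last block and the gzip trailer.

            // ToArray() still works after the GZipStream has closed compMS.
            return compMS.ToArray();
        }
    }

    // Option 2: no copy. Keep GetBuffer() but pair it with Length.
    // leaveOpen: true keeps compMS open so that Length can be read.
    public static ArraySegment<byte> CompressNoCopy(byte[] data)
    {
        var compMS = new MemoryStream();
        using (var gzip = new GZipStream(compMS, CompressionMode.Compress, leaveOpen: true))
        {
            gzip.Write(data, 0, data.Length);
        }
        // Only the first Length bytes of the buffer are compressed data.
        return new ArraySegment<byte>(compMS.GetBuffer(), 0, (int)compMS.Length);
    }

    // Round-trip helper to check that nothing was lost.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Note that Length has to be read while the MemoryStream is still open, which is why the second method passes leaveOpen: true; ToArray() and GetBuffer() keep working after the stream has been closed. Also, a correctly trimmed gzip stream can still end in a few zero bytes, because the gzip trailer stores the uncompressed size as a 4-byte little-endian integer.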
