How does Huffman encoding construct an image (JPEG) from DCT coefficients?

Posted on 2025-01-08 05:24:38

I have a 512x512 image that I am trying to recompress. Here are the steps for recompressing an image to a JPEG file:

    1) convert RGB to YCrCb
    2) downsample Cr and Cb
    3) apply the DCT to YCrCb and quantize according to the chosen quality
    4) perform Huffman encoding on the quantized DCT coefficients

Before Huffman encoding, I counted the DCT coefficients and got 393216. Dividing that by 64 gives the number of 8x8 DCT blocks, which is 6144.

Then I counted the 8x8 blocks in the pixel domain. 512/8 = 64, so there are 64 blocks horizontally and 64 blocks vertically. 64 x 64 = 4096, which does not equal the number of DCT blocks, and the number of pixels is 512x512 = 262144.

My question is: how does Huffman encoding magically transform 393216 coefficients into 262144 pixels, recover each pixel value, and compute the dimensions (512x512) of the compressed JPEG image?

Thank you in advance. :D

Comments (3)

天荒地未老 2025-01-15 05:24:38

If your image were encoded with no color subsampling, there would be a 1:1 ratio of 8x8 coefficient blocks to 8x8 color component blocks. Each MCU (minimum coded unit) would cover 8x8 pixels and contain three 8x8 coefficient blocks. 512x512 pixels = 64x64 MCUs x 3 blocks (one each for Y, Cr and Cb) = 12288 coefficient blocks.

Since you said you subsampled the color (I assume by 2 in both directions), you now have six 8x8 blocks per MCU. In the diagram below, the leftmost diagram shows the case with no color subsampling and the rightmost diagram shows subsampling in both directions. The MCU size in this case is 16x16 pixels. Each 16x16 block of pixels needs six 8x8 coefficient blocks to define it (4 Y, 1 Cr, 1 Cb). If you divide the image into 16x16 MCUs, you get 32x32 = 1024 MCUs with six 8x8 blocks each, for a total of 6144 coefficient blocks. So, to answer your question: it is the color subsampling, not the Huffman encoding, that changes the number of coefficients. Color subsampling is one source of JPEG's compression, and it exploits a property of the human visual system: our eyes are more sensitive to changes in luminance than to changes in chrominance.

[Image: MCU layouts, from no color subsampling (leftmost) to subsampling in both directions (rightmost)]
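The block counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of a JPEG encoder. The function name `count_blocks` and its parameters are made up for illustration, and it assumes the image dimensions are exact multiples of the MCU size.

```python
def count_blocks(width, height, h_sub=2, v_sub=2, block=8):
    """Count MCUs and 8x8 coefficient blocks for a 3-component image.

    h_sub and v_sub are the chroma subsampling factors (1 = none, 2 = halved).
    Assumes width and height are exact multiples of the MCU size.
    """
    mcu_w, mcu_h = block * h_sub, block * v_sub
    mcus = (width // mcu_w) * (height // mcu_h)
    y_blocks_per_mcu = h_sub * v_sub        # luma stays at full resolution
    blocks_per_mcu = y_blocks_per_mcu + 2   # plus one Cb block and one Cr block
    return mcus, blocks_per_mcu, mcus * blocks_per_mcu


print(count_blocks(512, 512, 1, 1))  # (4096, 3, 12288): no subsampling
print(count_blocks(512, 512, 2, 2))  # (1024, 6, 6144): subsampled in both directions
```

Multiplying the 6144 blocks by 64 coefficients per block gives the 393216 coefficients counted in the question.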

笑看君怀她人 2025-01-15 05:24:38

Huffman encoding doesn't transform coefficients into pixels or anything like that. At least not the Huffman encoding that I'm thinking of. All Huffman encoding does is take a list of tokens and represent them with fewer bits, based on the frequency of those tokens.

An example: you have tokens a, b, c, and d.

Uncompressed, each of your tokens would require 2 bits (00, 01, 10, and 11).

Let's say a=00, b=01, c=10, and d=11.

aabaccda would be represented as 0000010010101100 (16 bits).

With Huffman encoding you'd represent a with fewer bits because it's more common, and you'd represent b and d with more bits because they're less common, something along the lines of:

a=0, b=110, c=10, d=111, and then

aabaccda would be represented as 00110010101110 (14 bits).
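The example above can be reproduced in Python. This is a minimal sketch, not how a JPEG codec stores its tables: the `codes` dictionary is the table from the example, while `encode` and `huffman_code_lengths` are illustrative helpers that build a Huffman tree from the token frequencies with the standard-library `heapq` module. The second helper returns only code lengths, since the exact bit patterns depend on how ties are broken.

```python
import heapq
from collections import Counter


def encode(text, codes):
    """Concatenate the code word of every token in text."""
    return "".join(codes[ch] for ch in text)


def huffman_code_lengths(text):
    """Return {symbol: code length} from a Huffman tree built over text."""
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


text = "aabaccda"
fixed = {"a": "00", "b": "01", "c": "10", "d": "11"}
codes = {"a": "0", "b": "110", "c": "10", "d": "111"}

print(encode(text, fixed))   # 0000010010101100 (16 bits)
print(encode(text, codes))   # 00110010101110 (14 bits)

lengths = huffman_code_lengths(text)
print(lengths)               # a gets 1 bit, c gets 2, b and d get 3 each
print(sum(lengths[ch] for ch in text))  # 14
```

The number of tokens is the same before and after encoding. Only the number of bits used to store them changes.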

意中人 2025-01-15 05:24:38

Your image is 512x512 pixels.
The Y component is 512x512, so its 262144 pixels become 262144 DCT coefficients.
The Cb and Cr components are downsampled by 2 in each direction, so each is 256x256 pixels, which become 65536 DCT coefficients per component.
The sum of all DCT coefficients is 262144+65536+65536 = 393216.
Huffman has nothing to do with this.
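The same sum, and the way a decoder gets back to 512x512 pixels, can be sketched in Python. The names `coefficient_counts` and `upsample_nearest` are illustrative. The sketch assumes the chroma planes are restored by simple sample repetition, whereas real decoders may use a smoother interpolation.

```python
def coefficient_counts(width, height, chroma_factor=2):
    """Coefficients per component; the DCT yields one coefficient per sample."""
    luma = width * height
    chroma = (width // chroma_factor) * (height // chroma_factor)
    return {"Y": luma, "Cb": chroma, "Cr": chroma}


def upsample_nearest(plane, factor=2):
    """Repeat every sample `factor` times in both directions."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


counts = coefficient_counts(512, 512)
total = sum(counts.values())
print(counts)              # {'Y': 262144, 'Cb': 65536, 'Cr': 65536}
print(total, total // 64)  # 393216 6144

# A 2x2 chroma plane stretched back to 4x4, as a decoder would do for 256x256 -> 512x512.
print(upsample_nearest([[1, 2], [3, 4]]))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

After upsampling, every one of the 262144 pixel positions has a Y, a Cb and a Cr value again, and those can be converted back to RGB.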
