How does Huffman encoding construct an image (JPEG) from DCT coefficients?
I have a 512x512 image that I am trying to recompress. Here are the steps for recompressing an image to a JPEG file:
1) convert RGB to YCrCb
2) downsample Cr and Cb
3) apply the DCT to YCrCb and quantize according to the chosen quality
4) perform Huffman encoding on the quantized DCT coefficients
Before Huffman encoding, I counted the number of DCT coefficients: 393216. Dividing that by 64 gives the number of 8x8 DCT blocks, which is 6144.
Then I counted the number of 8x8 blocks in the pixel domain. 512/8 = 64, which gives 64 blocks horizontally and 64 blocks vertically. 64 x 64 = 4096, which does not equal the number of DCT blocks, and the number of pixels is 512x512 = 262144.
My question is: how does Huffman encoding magically transform 393216 coefficients into 262144 pixels, recover each pixel value, and compute the dimensions (512x512) of the compressed image (JPEG)?
Thank you in advance. :D
Comments (3)
If your image were encoded with no color subsampling, there would be a 1:1 ratio of 8x8 coefficient blocks to 8x8 color component blocks. Each MCU (minimum coded unit) would be 8x8 pixels and have three 8x8 coefficient blocks. 512x512 pixels = 64x64 8x8 blocks x 3 (one each for Y, Cr and Cb) = 12288 coefficient blocks.
Since you said you subsampled the color (I assume in both directions), you now have six 8x8 blocks for each MCU. In the diagram below, the leftmost diagram shows the case with no color subsampling and the rightmost diagram shows subsampling in both directions. The MCU size in this case is 16x16 pixels. Each 16x16 block of pixels needs six 8x8 coefficient blocks to define it (4 Y, 1 Cr, 1 Cb). If you divide the image into 16x16 MCUs, you get 32x32 = 1024 MCUs, each with six 8x8 blocks, for a total of 6144 coefficient blocks. So, to answer your question, it is the color subsampling that changes the number of coefficients, not the Huffman encoding. Part of the compression that comes from color subsampling in JPEG images exploits a feature of the human visual system: our eyes are more sensitive to changes in luminance than to changes in chrominance.
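The block arithmetic above can be checked with a short sketch. This is only an illustration, not part of any JPEG library: the function name `count_blocks` and its parameters are made up here, and it assumes a three-component image whose chroma planes have sampling factor 1 while luma has factor 1 or 2 in each direction.

```python
def count_blocks(width, height, h_factor=2, v_factor=2):
    """Count MCUs and 8x8 coefficient blocks for a 3-component JPEG.

    h_factor and v_factor are the luma sampling factors (1 or 2).
    Assumption: both chroma components use sampling factor 1.
    """
    mcu_w = 8 * h_factor
    mcu_h = 8 * v_factor
    # Round up, since partial MCUs at the right/bottom edges are padded.
    mcus_x = -(-width // mcu_w)
    mcus_y = -(-height // mcu_h)
    mcus = mcus_x * mcus_y
    # Luma blocks per MCU, plus one block each for the two chroma components.
    blocks_per_mcu = h_factor * v_factor + 2
    blocks = mcus * blocks_per_mcu
    coefficients = blocks * 64
    return mcus, blocks_per_mcu, blocks, coefficients


no_subsampling = count_blocks(512, 512, 1, 1)
both_directions = count_blocks(512, 512, 2, 2)
print(no_subsampling)   # (MCUs, blocks per MCU, blocks, coefficients)
print(both_directions)
```

With subsampling in both directions, the result matches the 6144 blocks and 393216 coefficients counted in the question.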
Huffman encoding doesn't transform coefficients to pixels or anything like that. At least not the Huffman encoding that I'm thinking of. All Huffman encoding does is take a list of tokens and represent them with fewer bits based on the frequency of those tokens.
An example: you have tokens a, b, c, and d.
Uncompressed, each of your tokens would require 2 bits (00, 01, 10, and 11).
Let's say a=00, b=01, c=10, and d=11. Then
aabaccda
would be represented as
0000010010101100
which is 16 bits. With Huffman encoding you'd represent a with fewer bits because it's more common, and you'd represent b and d with more bits because they're less common, something like a=0, b=110, c=10, d=111. Then
aabaccda
would be represented as
00110010101110
which is 14 bits.
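The token example above can be reproduced with a small sketch. It is a generic textbook Huffman construction, not the table-driven scheme a real JPEG encoder uses; the helper names `huffman_code_lengths` and `encode` are invented for illustration, and the two code tables are the ones given in the example.

```python
import heapq
from collections import Counter


def huffman_code_lengths(message):
    """Return {symbol: code length in bits} for a Huffman code of `message`."""
    freq = Counter(message)
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(message, table):
    """Concatenate the code word of each symbol."""
    return "".join(table[s] for s in message)


message = "aabaccda"
fixed_table = {"a": "00", "b": "01", "c": "10", "d": "11"}
huffman_table = {"a": "0", "b": "110", "c": "10", "d": "111"}

fixed_bits = encode(message, fixed_table)
huffman_bits = encode(message, huffman_table)
lengths = huffman_code_lengths(message)

print(fixed_bits, len(fixed_bits))
print(huffman_bits, len(huffman_bits))
print(lengths)
```

The code lengths computed from the token frequencies agree with the lengths of the code words chosen by hand in the example: the most frequent token gets the shortest code. The number of tokens is unchanged; only the number of bits used to store them shrinks.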
Your image is 512x512 pixels.
The Y component is 512x512, so its 262144 samples turn into 262144 DCT coefficients (the DCT maps each 8x8 block of samples to 64 coefficients).
The Cb and Cr components are downsampled by 2 in each direction, so each becomes 256x256 samples, which turn into 65536 DCT coefficients each.
The sum of all DCT coefficients is 262144 + 65536 + 65536 = 393216.
Huffman has nothing to do with this.
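The per-component count can be sketched the same way, along with the reverse step a decoder performs. The helper names `plane_size` and `upsample_nearest` are hypothetical, and sample repetition is only the simplest way to upsample chroma; an actual decoder may interpolate instead.

```python
def plane_size(width, height, factor):
    """Size of a component plane after subsampling by `factor` in both directions."""
    return (-(-width // factor), -(-height // factor))


def upsample_nearest(plane, factor):
    """Expand a 2-D list by repeating every sample `factor` times in both directions."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Assumption: 512x512 image, chroma downsampled by 2 in each direction.
sizes = {
    "Y": plane_size(512, 512, 1),
    "Cb": plane_size(512, 512, 2),
    "Cr": plane_size(512, 512, 2),
}
# One DCT coefficient per sample in every component plane.
total_coefficients = sum(w * h for w, h in sizes.values())
print(sizes, total_coefficients)

# Decoder side: a 256x256 chroma plane is stretched back to 512x512, so every
# pixel again has a Y, a Cb and a Cr value before conversion to RGB.
cb_plane = [[128] * 256 for _ in range(256)]
cb_full = upsample_nearest(cb_plane, 2)
print(len(cb_full), len(cb_full[0]))
```

After upsampling, each of the 262144 pixels has three component values, which is why 393216 coefficients are enough to reconstruct the full image.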