Extracting DCT coefficients from encoded images and video
Is there a way to easily extract the DCT coefficients (and quantization parameters) from encoded images and video? Any decoder software must be using them to decode block-DCT encoded images and video, so I'm pretty sure the decoder knows what they are. Is there a way to expose them to whoever is using the decoder?
I'm implementing some video quality assessment algorithms that work directly in the DCT domain. Currently, the majority of my code uses OpenCV, so it would be great if anyone knows of a solution using that framework. I don't mind using other libraries (perhaps libjpeg, but that seems to be for still images only), but my primary concern is to do as little format-specific work as possible (I don't want to reinvent the wheel and write my own decoders). I want to be able to open any video/image (H.264, MPEG, JPEG, etc.) that OpenCV can open and, if it's block DCT-encoded, get the DCT coefficients.
In the worst case, I know that I can write up my own block DCT code, run the decompressed frames/images through it and then I'd be back in the DCT domain. That's hardly an elegant solution, and I hope I can do better.
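For reference, the fallback described above can be sketched as follows. This is a minimal illustration, assuming an orthonormal 8x8 DCT-II with JPEG's scaling; the names `Block8` and `forward_dct8x8` are made up for this example. Coefficients recomputed this way from decoded pixels will generally not match the values stored in the bitstream, because decoding already involves dequantization, rounding, colour conversion and (for video) prediction.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// One 8x8 block of samples, indexed as block[y][x].
using Block8 = std::array<std::array<double, 8>, 8>;

// Orthonormal 2-D DCT-II of an 8x8 block (same scaling as the JPEG DCT).
// Direct O(N^4) evaluation: fine for illustration, too slow for real video.
Block8 forward_dct8x8(const Block8& pixels) {
    const double pi = std::acos(-1.0);
    Block8 coeffs{};
    for (int v = 0; v < 8; ++v) {
        for (int u = 0; u < 8; ++u) {
            double sum = 0.0;
            for (int y = 0; y < 8; ++y) {
                for (int x = 0; x < 8; ++x) {
                    sum += pixels[y][x]
                         * std::cos((2 * x + 1) * u * pi / 16.0)
                         * std::cos((2 * y + 1) * v * pi / 16.0);
                }
            }
            // Normalisation factors: 1/sqrt(2) for the zero frequency, 1 otherwise.
            const double cu = (u == 0) ? std::sqrt(0.5) : 1.0;
            const double cv = (v == 0) ? std::sqrt(0.5) : 1.0;
            coeffs[v][u] = 0.25 * cu * cv * sum;
        }
    }
    return coeffs;
}
```

To use it on a decoded frame, one would copy each 8x8 tile of a single channel into a `Block8` and call `forward_dct8x8` on it.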
Presently, I use the fairly common OpenCV boilerplate to open images:
IplImage *image = cvLoadImage(filename);
// Run quality assessment metric
The code I'm using for video is equally trivial:
CvCapture *capture = cvCaptureFromAVI(filename);
while (cvGrabFrame(capture))
{
    IplImage *frame = cvRetrieveFrame(capture);
    // Run quality assessment metric on frame
}
cvReleaseCapture(&capture);
In both cases, I get a 3-channel IplImage in BGR format. Is there any way I can get the DCT coefficients as well?
Comments (2)
Well, I did a bit of reading and my original question seems to be an instance of wishful thinking.
Basically, it's not possible to get DCT coefficients from H.264 video frames, for the simple reason that H.264 doesn't use the DCT as such. It uses a different transform (an integer transform that approximates the DCT). Next, the coefficients for that transform don't necessarily change on a frame-by-frame basis; H.264 is smarter because it splits frames into slices. It should be possible to get those coefficients through a special decoder, but I doubt OpenCV exposes them to the user.
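To make the "integer transform" concrete, here is a small sketch of the 4x4 forward core transform commonly given for H.264, Y = Cf · X · Cfᵀ. The function and type names are invented for this example, and the sketch leaves out the scaling that the codec folds into quantization. In the codec, the input block holds prediction residuals rather than raw pixels.

```cpp
#include <array>
#include <cassert>

// One 4x4 block of integer residuals, indexed as block[row][col].
using Block4 = std::array<std::array<int, 4>, 4>;

// Forward 4x4 core transform, Y = Cf * X * Cf^T, using integer arithmetic only.
// The per-coefficient scaling that makes this approximate a DCT is part of
// the quantization step and is not shown here.
Block4 h264_core_transform4x4(const Block4& x) {
    static const int cf[4][4] = {
        {1,  1,  1,  1},
        {2,  1, -1, -2},
        {1, -1, -1,  1},
        {1, -2,  2, -1},
    };
    Block4 tmp{};
    Block4 y{};
    // tmp = Cf * X
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                tmp[i][j] += cf[i][k] * x[k][j];
    // y = tmp * Cf^T
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                y[i][j] += tmp[i][k] * cf[j][k];
    return y;
}
```

A constant block produces a single nonzero value at position (0, 0), which plays the same role as the DC term of a DCT.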
For JPEG, things are a bit more positive. As I suspected, libjpeg exposes the DCT coefficients for you. I wrote a small app to show that it works (source at the end). It makes a new image using the DC term from each block. Because the DC term is equal to the block average (after proper scaling), the DC images are downsampled versions of the input JPEG image.
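The "proper scaling" can be worked out from the JPEG forward DCT. The derivation below is a sketch that assumes 8-bit samples (so the level shift is 128) and uses d for the quantized DC value stored in the file and Q_{00} for the DC entry of the quantization table; these symbols are introduced here for illustration.

```latex
% JPEG's 8x8 forward DCT, with f(x,y) the level-shifted samples (pixel - 128)
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)
         \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16},
\qquad C(0) = \tfrac{1}{\sqrt{2}},\quad C(k) = 1 \text{ for } k > 0

% DC term: both cosines equal 1 and C(0)^2 = 1/2
F(0,0) = \frac{1}{8} \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y) = 8\,\bar{f}

% The file stores d = \mathrm{round}(F(0,0) / Q_{00}), so
\text{block mean} \approx \frac{d \, Q_{00}}{8} + 128
```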
EDIT: fixed scaling in source
Original image (512 x 512):
DC images (64x64): luma Cr Cb RGB
Source (C++):
You can use libjpeg to extract the DCT data of a JPEG file, but for H.264 video files I can't find any open-source code that gives you the DCT data (actually integer-DCT data). You can, however, start from open-source H.264 software such as JM, JSVM, or x264. In those sources, you have to find the specific functions that call the DCT routine and change them to output the DCT data in the form you want.
For images:
Use the following code. After calling read_jpeg_file( infilename, v, quant_tbl ), v and quant_tbl will hold the DCT data and the quantization table of your JPEG image, respectively. I used QVector to store my output data; change it to your preferred C++ container.
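Once the quantized coefficients and the quantization table are available, the actual DCT values are their elementwise product. The sketch below shows only that step and is separate from read_jpeg_file itself. The type and function names are hypothetical, with `std::array` standing in for QVector, and it assumes both arrays use the same coefficient order, as is the case for the data libjpeg hands back.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using QuantizedBlock = std::array<std::int16_t, 64>;   // values as stored in the file
using QuantTable     = std::array<std::uint16_t, 64>;  // step size per coefficient
using DctBlock       = std::array<int, 64>;            // reconstructed DCT values

// Dequantize one 8x8 block: multiply each stored value by its step size.
// Both inputs must use the same coefficient ordering.
DctBlock dequantize(const QuantizedBlock& q, const QuantTable& table) {
    DctBlock out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<int>(q[i]) * static_cast<int>(table[i]);
    }
    return out;
}
```

The result is only an approximation of the encoder's original DCT values, since the rounding done during quantization cannot be undone.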