Is it useful to study the H.261 specification as an introduction to modern video compression technology, or should I start somewhere else? I'm not sure where to begin, but H.261 seems simple enough to make the concepts easy to grasp.

The specification isn't a very good introduction. It's written primarily to be precise, and contains little explanation of why things are the way they are. H.261 is closely related to MPEG-1 video, which builds on many of the same ideas. One book I've used (and find quite well written) is MPEG Video Compression Standard, by Mitchell, Pennebaker, Fogg and LeGall. FWIW, this covers both MPEG-1 and MPEG-2 (MPEG-2 video is also published as H.262; MPEG-1 video has no H.26x number of its own, but it descends closely from H.261).
I partially agree with Jerry Coffin; I think H.261 is definitely a good starting point for anyone learning about video compression, but reading the specification directly is not a good idea.
The basic building blocks from H.261 that I would focus on are motion compensation, macroblocks, DCT to reduce spatial redundancy, and differential PCM (DPCM) to reduce temporal redundancy.
If I had to choose one general principle of video compression for learning purposes, I would start with motion estimation and motion compensation. Try this thought exercise: imagine two consecutive video frames separated by only 1/30 of a second. They will be pretty similar, right? Without peeking at the Internet, how would you exploit the information already encoded in frame 1 to reduce the code length of frame 2? Now, go search for motion estimation.
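One possible answer to the thought exercise is block matching: for each block of frame 2, look for the most similar block in frame 1 and send only the displacement (plus whatever residual is left). The sketch below is a toy illustration, not H.261's actual search procedure (the standard only defines what the decoder does with the vectors). The function names, the 4x4 block size, the +/-2 pixel search range, and the sum-of-absolute-differences (SAD) cost are all assumptions picked for readability.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at (cx, cy)
    and the n x n block of `ref` at (rx, ry). Frames are lists of rows."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def best_motion_vector(cur, ref, cx, cy, n=4, search=2):
    """Exhaustive search: try every displacement within +/- `search` pixels
    and keep the one with the smallest SAD. Returns ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


def make_frame(width, height, x0, y0, size=4):
    """Toy frame: black background with a textured square at (x0, y0)."""
    frame = [[0] * width for _ in range(height)]
    for j in range(size):
        for i in range(size):
            frame[y0 + j][x0 + i] = 100 + 10 * j + i
    return frame


frame1 = make_frame(12, 12, 3, 4)  # object at (3, 4)
frame2 = make_frame(12, 12, 5, 5)  # same object, moved by (+2, +1)

# Where in frame 1 did the block now at (5, 5) in frame 2 come from?
mv, cost = best_motion_vector(frame2, frame1, 5, 5)
print(mv, cost)  # (-2, -1) 0
```

A cost of 0 means the block in frame 2 is predicted perfectly from frame 1, so for this block an encoder would only need to transmit the vector. With real video the match is rarely perfect, and the leftover difference (the residual) is what gets transformed and coded.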
Next, how would you reduce spatial redundancy? H.261 takes an approach similar to JPEG: it applies a DCT to 8x8 blocks of pixels and quantizes the resulting coefficients.
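To see why a DCT helps with spatial redundancy, here is a minimal sketch that transforms a smooth 8x8 block and quantizes the result. It uses the textbook orthonormal DCT-II written as direct loops. The uniform step size of 16 and the gradient test block are arbitrary assumptions for illustration and do not reproduce H.261's actual quantizer.

```python
import math

N = 8  # H.261 and JPEG both work on 8x8 blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = scale(u) * scale(v) * total
    return out


# A smooth block: brightness ramps gently from left to right.
block = [[100 + 4 * x for x in range(N)] for _ in range(N)]

coeffs = dct_2d(block)

# Uniform quantization with an assumed step size; small coefficients become 0.
STEP = 16
quantized = [[round(c / STEP) for c in row] for row in coeffs]

nonzero = sum(1 for row in quantized for q in row if q != 0)
print("DC coefficient:", round(coeffs[0][0], 3))
print("nonzero coefficients after quantization:", nonzero, "of", N * N)
```

For smooth content like this, nearly all of the energy lands in a few low-frequency coefficients, so most of the 64 quantized values are zero and cost very little to encode. That is the spatial-redundancy reduction the DCT step is there for.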
Edit: From Wang, Ostermann, and Zhang (pp. 293-4, on block-based hybrid video coding, which is essentially what H.261 is):