Parallelization and H.264 (or possibly any compression codec)? Why is this so hard?
My limited (and probably wrong?) understanding of video compression is that intra-frames are completely independent. In other words, all the picture data for an intra-frame (key frame) is stored in its entirety within that frame. It's the following inter-frames (P- and B-frames in H.264, I think) that depend on data in other frames to be "drawn."
If these intra-frames are completely independent, why isn't encoding an embarrassingly parallel problem? If you had N processors and X I-frames, you could just hand X/N chunks of the source to each processor to encode independently, then stitch them all together at the end, right? But that doesn't seem to be the case -- or at least, I haven't seen any encoders that exploit this kind of parallelism. What am I not understanding?
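To make the scheme I have in mind concrete, here is a minimal sketch in Python. It assumes the source is simply cut at fixed time boundaries and each chunk is handed to a separate ffmpeg/x264 process; the chunk length, worker count, duration, and file names are all made-up illustration values:

    import subprocess
    from concurrent.futures import ProcessPoolExecutor

    CHUNK_SECONDS = 10        # arbitrary chunk length
    WORKERS = 4               # the N processors
    SOURCE = "raw_input.y4m"  # hypothetical raw source file on disk
    TOTAL_SECONDS = 120       # assumed known duration

    def encode_chunk(index):
        """Encode one fixed-length slice of the source with x264 via ffmpeg."""
        start = index * CHUNK_SECONDS
        out = f"chunk_{index:03d}.mp4"
        subprocess.run([
            "ffmpeg", "-y",
            "-ss", str(start), "-t", str(CHUNK_SECONDS),
            "-i", SOURCE,
            "-c:v", "libx264",
            out,
        ], check=True)
        return out

    if __name__ == "__main__":
        chunks = range(TOTAL_SECONDS // CHUNK_SECONDS)
        with ProcessPoolExecutor(max_workers=WORKERS) as pool:
            outputs = list(pool.map(encode_chunk, chunks))
        # The pieces would then be joined, e.g. with ffmpeg's concat demuxer.
        print(outputs)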
2 Answers
The first thing to consider is where you want to put your intra-frames. For the best compression you need to choose this wisely, for example preferring a scene change over the middle of a static sequence. To find the best places you need to analyse the raw video, which can either be done in an extra pass (expensive) or decided on the fly as you encode.
So to parcel up the stream into chunks you either need to make an extra pass to analyse it or just arbitrarily divide it up and lose some compression efficiency.
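As a rough illustration of what that analysis involves, here is a sketch of naive scene-change detection: flag a frame as a cut candidate when its mean absolute difference from the previous frame exceeds a threshold. The threshold value and the grayscale-frame input are assumptions made for the sketch, not anything a real encoder prescribes:

    import numpy as np

    def scene_change_indices(frames, threshold=30.0):
        """Return indices of frames that differ sharply from their
        predecessor -- candidate places to start a new chunk/I-frame.

        frames: iterable of HxW grayscale uint8 numpy arrays.
        threshold: hypothetical cut-off on mean absolute difference.
        """
        cuts = []
        prev = None
        for i, frame in enumerate(frames):
            if prev is not None:
                diff = np.abs(frame.astype(np.int16) - prev.astype(np.int16))
                if diff.mean() > threshold:
                    cuts.append(i)
            prev = frame
        return cuts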
Then you have to think about how you've got this raw video to encode. Either it's streaming into your compression program from somewhere or the whole thing is available on disk.
If it's streaming then you're out of luck as you don't have random access to different parts of the stream. Yes, you could buffer it but a quick calculation shows that this would either need a LOT of memory or you'd have to buffer to disk, which leads on to the next point:
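That quick calculation is worth spelling out. Assuming 1080p at 24 fps with 4:2:0 chroma subsampling (1.5 bytes per pixel); the 10-second chunk and 4 workers are arbitrary example numbers:

    # Raw 1080p, 24 fps, 4:2:0 chroma subsampling (1.5 bytes per pixel)
    frame_bytes = 1920 * 1080 * 1.5      # ~3.1 MB per frame
    rate_per_sec = frame_bytes * 24      # ~75 MB/s
    rate_per_min = rate_per_sec * 60     # ~4.5 GB/min (the figure below)

    # Buffering a 10-second chunk for each of 4 parallel workers:
    chunk_bytes = rate_per_sec * 10      # ~750 MB per chunk
    total_bytes = chunk_bytes * 4        # ~3 GB of RAM just for buffers
    print(f"{rate_per_min / 1e9:.1f} GB/min, {total_bytes / 1e9:.1f} GB buffered")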
If you have the whole raw file stored locally then you could parcel out portions to different processes or threads. Except that your problem would now be disk access! Consider that the data rate for raw 1080p, 24 fps video is about 4 GB per minute. With a single process encoding it, the disk will be heavily occupied just providing the raw data. It could even be the slowest part of the process (though probably not, unless your hard drive is very fragmented!)
Now think about getting 4 processes to access the same file, all trying to grab the raw data at extremely high rates. The hard drive just won't be able to keep the encoders fed with data; the weak link won't be slow processors but slow data access.
So unless you've got some really professional kit to store your uncompressed video, parcelling up different sections for parallel encoding just isn't practical yet.
You are right: parallelization makes encoding faster.
In fact, the x264 encoder already provides parallel encoding capability:
http://www.videolan.org/developers/x264.html
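For what it's worth, x264's parallelism works within a single encode (frame-level threading by default, optionally sliced threads) rather than by splitting the file into independently encoded chunks. A minimal sketch of driving it from Python, assuming the x264 command-line tool is installed and a raw input.y4m file exists:

    import subprocess

    # --threads enables x264's built-in parallelism; no manual chunking needed.
    subprocess.run(
        ["x264", "--threads", "8", "-o", "output.264", "input.y4m"],
        check=True,
    )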