Audio/video synchronization, TS MPEG-2; H.264/AVC, understanding PTS in HandBrake

Posted 2024-10-12 15:27:32

Synchronization has always fascinated me, or to be precise: why a .ts can be viewed in sync by media players, while the demuxed audio+video reassembled is out of sync.

So I'm trying to understand this, and what can be done to prevent it.

I've read the following:
https://trac.handbrake.fr/wiki/LibHandBrakeSync
and the source of sync.c (also available on the wiki)

BitStreamTools have written a Theory 101 on the subject also (but I can't link as I'm a new user, sorry)

While I thought my understanding of PCR/PTS was (conceptually) right, I'm having a hard time following HandBrake's excellent A/V sync paper.

My question is this: is there a somewhat intuitive explanation of A/V synchronization (it can be brief or long)? While I know that one can recalculate PTS from PCR if the audio or video PTS is corrupted (discontinuity?), HandBrake does not seem to rely on this, but on its internal PTS: 0, += 1/fps (~=5), 10, 15, ....

Would it be possible to recalculate the pts offsets and correct the .ts (binary) by fixing all audio and video PTS values (and skewing all DTS with the same offset, so the player doesn't "run out of frames", so to speak), and thus have a .ts which can be demuxed, and the isolated tracks then be in sync (if put back together)?
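
As a minimal sketch of what "skewing all PTS and DTS by the same offset" would involve at the byte level, the Python below decodes, shifts and re-encodes the 5-byte timestamp field of a PES header. The function names are made up for illustration; locating PES headers inside the transport stream, choosing the offset, and handling the PCR are all left out, and nothing here is taken from HandBrake.

```python
PTS_MODULUS = 1 << 33  # PTS/DTS are 33-bit counters of a 90 kHz clock


def decode_timestamp(field: bytes) -> int:
    """Decode a 5-byte PES PTS or DTS field into 90 kHz ticks."""
    if len(field) != 5:
        raise ValueError("a PES timestamp field is exactly 5 bytes")
    return (
        ((field[0] >> 1) & 0x07) << 30   # bits 32..30
        | field[1] << 22                 # bits 29..22
        | (field[2] >> 1) << 15          # bits 21..15
        | field[3] << 7                  # bits 14..7
        | field[4] >> 1                  # bits 6..0
    )


def encode_timestamp(ticks: int, prefix: int) -> bytes:
    """Encode ticks into the 5-byte field.

    prefix is the 4-bit code in front of the timestamp:
    0b0010 for a lone PTS, 0b0011 for a PTS followed by a DTS,
    0b0001 for the DTS itself. Each marker bit is set to 1.
    """
    ticks %= PTS_MODULUS
    return bytes([
        (prefix << 4) | (((ticks >> 30) & 0x07) << 1) | 1,
        (ticks >> 22) & 0xFF,
        (((ticks >> 15) & 0x7F) << 1) | 1,
        (ticks >> 7) & 0xFF,
        ((ticks & 0x7F) << 1) | 1,
    ])


def shift_timestamp(field: bytes, offset_ticks: int) -> bytes:
    """Return the field with offset_ticks added, wrapping at 2**33."""
    prefix = field[0] >> 4
    return encode_timestamp(decode_timestamp(field) + offset_ticks, prefix)
```

Applying the same `offset_ticks` to every PTS and DTS of a stream keeps their relative spacing intact, which is the property the question is asking about; whether that alone keeps the demuxed tracks in sync depends on how the demuxer treats the first timestamps of each track.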

EDIT:
Or would it not be possible to fix this by using the PCR to recalculate all PTS values in a given .ts? While I understand that some video frames or audio might be damaged in broadcast and so cannot be presented correctly, I'll leave the handling of this (such as removing the video if it's damaged and has a corresponding audio part, inserting x ms of silence if the audio packet is damaged, etc.) for later, and for the sake of discussion I'll presume all frames are intact. (But then the PTS values would always be correct, or what?)

Appendix:
My take on the HandBrake A/V paper is this:
At "expected" 100, the offset is calculated as video PTS (100) - audio PTS (0) - the internal PTS, to bring the audio up to the same presentation time, thus giving a PTS offset of 99. At 105 the offset would be 105-5 = 100, not 99, but we proceed to use 99 as the offset since there's no need to recalculate (100-99 = 1. 1/fps < 100ms). At 150, the PTS offset is calculated again because the video PTS is decreasing, as opposed to increasing...

I'm almost positive I'm completely wrong about this, but can someone point me in the right direction, please?
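
To make the reading above concrete, here is a rough Python sketch of that interpretation: keep one offset between the source PTS and an internal timeline, reuse it while the drift stays under 100 ms, and recompute it when the source PTS jumps backwards. The class name, the threshold constant and the use of 90 kHz ticks are assumptions for illustration; this is not HandBrake's sync.c and may not match what it actually does.

```python
THRESHOLD_TICKS = 9_000  # 100 ms expressed in 90 kHz ticks (assumed unit)


class OffsetTracker:
    """Tracks the offset between source PTS and an internal timeline."""

    def __init__(self) -> None:
        self.offset = None           # no offset known until the first sample
        self.last_source_pts = None

    def update(self, source_pts: int, internal_pts: int) -> int:
        candidate = source_pts - internal_pts
        went_backwards = (
            self.last_source_pts is not None
            and source_pts < self.last_source_pts
        )
        drifted = (
            self.offset is not None
            and abs(candidate - self.offset) > THRESHOLD_TICKS
        )
        # Recompute only on the first sample, on a backwards jump,
        # or when the drift exceeds the threshold.
        if self.offset is None or went_backwards or drifted:
            self.offset = candidate
        self.last_source_pts = source_pts
        return self.offset
```

For example, a tracker fed `(9_000_000, 0)` and then `(9_003_700, 3_600)` keeps the first offset, because the candidate differs by only 100 ticks; a following sample whose source PTS is lower than the previous one forces a new offset.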

  • Josh

Comments (1)

征﹌骨岁月お 2024-10-19 15:27:32

The concept of audio/video synchronization goes much deeper. The first reading I would recommend is the following paper.

http://downloads.bbc.co.uk/rd/pubs/reports/1996-02.pdf

I won't repeat everything here, but essentially every encoder records timestamps and stamps them on the respective audio and video. Later on, when the decoder plays the stream, it does two things: first, it ensures that its own clock is slaved to the encoder's clock, and second, it ensures that every picture is presented on the screen, and every audio frame presented to the speaker, exactly when the respective time occurs. This is the only, and the best, way for audio to remain in sync with video. These timestamps are called PTS/DTS values, and they are expressed in units of a 90 kHz clock.

Understand that clocks skew over time, but since only the exact timestamp is referenced, the decoder plays everything out in exactly the same time order.

The major remaining concern is that the decoder's clock must stay synchronized with the encoder's clock. The first thing MPEG does is use a higher-precision clock at 27 MHz (300 times higher). Further, this timing needs to remain consistent across any transmission path in between (this is called the clock recovery process).
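
A small Python sketch of the arithmetic behind those 90 kHz timestamps may help. The helper names and the tuple layout are made up for illustration; a real decoder compares each PTS against its recovered system clock, which is only a plain integer here.

```python
PTS_CLOCK_HZ = 90_000  # PTS/DTS tick rate


def pts_to_seconds(pts: int) -> float:
    """Convert 90 kHz ticks to seconds."""
    return pts / PTS_CLOCK_HZ


def video_frame_ticks(fps_num: int, fps_den: int = 1) -> float:
    """Ticks between consecutive frames at fps_num/fps_den frames per second."""
    return PTS_CLOCK_HZ * fps_den / fps_num


def audio_frame_ticks(samples_per_frame: int, sample_rate: int) -> float:
    """Ticks covered by one audio frame."""
    return PTS_CLOCK_HZ * samples_per_frame / sample_rate


def due_for_presentation(units, clock_ticks: int):
    """Return the (kind, pts) units whose PTS has been reached by the clock."""
    return [(kind, pts) for kind, pts in units if pts <= clock_ticks]


if __name__ == "__main__":
    queue = [("video", 3_600), ("audio", 1_920), ("video", 7_200)]
    print(due_for_presentation(queue, 3_600))
```

At 25 fps a video frame lasts 3600 ticks, and a 1024-sample audio frame at 48 kHz lasts 1920 ticks, so audio and video timestamps advance at different step sizes but on the same timeline.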
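
The factor of 300 shows up directly in how the PCR is stored in a transport stream: a 33-bit base counted at 90 kHz, 6 reserved bits, and a 9-bit extension that counts 0 to 299 at 27 MHz. The Python below is a minimal sketch of decoding that 6-byte field; the function names are hypothetical, and finding the field inside a packet's adaptation field is not shown.

```python
PCR_CLOCK_HZ = 27_000_000  # system clock rate carried by the PCR


def decode_pcr(field: bytes) -> int:
    """Decode the 6-byte PCR field into 27 MHz ticks."""
    if len(field) != 6:
        raise ValueError("a PCR field is exactly 6 bytes")
    value = int.from_bytes(field, "big")  # 48 bits in total
    base = value >> 15                    # top 33 bits, 90 kHz units
    extension = value & 0x1FF             # low 9 bits, 27 MHz units
    return base * 300 + extension


def pcr_to_seconds(pcr: int) -> float:
    """Convert 27 MHz PCR ticks to seconds."""
    return pcr / PCR_CLOCK_HZ


def pcr_to_pts_ticks(pcr: int) -> int:
    """Express a PCR value on the 90 kHz scale used by PTS/DTS."""
    return pcr // 300
```

Dividing a PCR value by 300 puts it on the same scale as PTS and DTS, which is what allows the two kinds of timestamp to be compared.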

Below are a couple more good papers that explain how the clock recovery/synchronization process works.

https://www.soe.ucsc.edu/sites/default/files/technical-reports/UCSC-CRL-98-04.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.1016&rep=rep1&type=pdf

This final paper puts everything together very nicely.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.50.975&rep=rep1&type=pdf

Remember that PCR- and PTS/DTS-based audio/video synchronization is what makes digital TV broadcast so stringent, and far different from the streaming methods used on the Internet. It is crucial for keeping a 24x7 stream functioning.
