I'm working on a source filter that feeds video and audio captured by our software into a DirectShow graph. I got the video working relatively painlessly, but adding an audio output pin is proving to be quite a challenge. My specific question is: does an audio renderer modify the actual reference clock while it is playing sound?
I'm seeing very jerky video playback. Below is a chunk of a log file; it looks like once in a while the reference clock just "stops" while the system time keeps ticking. Does that make sense?
One thing I should mention is that the audio samples are u-Law, 8 kHz, 8-bit, and each packet is exactly 120 ms long. Here's the complication: when we receive audio data from the network, it doesn't come with time information, so our software assigns each sample a timestamp at the moment the packet is received. Video samples are stamped by the original source, so their timestamps are accurate. If I ignore the audio arrival times and simply assign sample timestamps 120 ms apart, video plays smoothly. The problem is that I still don't fully understand the relationship between the reference clock and the audio and video renderers. What really puzzles me is that we have another, similar source filter that plays the same data without jerky video. It doesn't have logging, and I haven't had a chance to add any to see whether the reference clock is also modified in that case.
Here is the relevant piece of the log:
Sys Clock (delta) StreamTime (delta) Drift between clocks:
------------------------------------------------------------------
15:54:40.755 (0.005) 1.838 (0.005) 0.000
15:54:40.761 (0.006) 1.844 (0.006) 0.000
15:54:40.889 (0.128) 1.972 (0.128) 0.000
15:54:40.894 (0.005) 1.977 (0.005) 0.000
15:54:40.899 (0.005) 1.982 (0.005) 0.000
15:54:40.903 (0.004) 1.986 (0.004) 0.000
15:54:40.931 (0.028) 2.014 (0.028) 0.000
15:54:40.936 (0.005) 2.019 (0.005) 0.000
15:54:41.019 (0.083) 2.080 (0.061) 0.022
15:54:41.175 (0.156) 2.080 (0.000) 0.178
15:54:41.181 (0.006) 2.080 (0.000) 0.184
15:54:41.190 (0.009) 2.080 (0.000) 0.193
15:54:41.197 (0.007) 2.080 (0.000) 0.200
15:54:41.202 (0.005) 2.080 (0.000) 0.205
15:54:41.210 (0.008) 2.080 (0.000) 0.213
15:54:41.216 (0.006) 2.080 (0.000) 0.219
15:54:41.220 (0.004) 2.080 (0.000) 0.223
15:54:41.313 (0.093) 2.080 (0.000) 0.316
15:54:41.317 (0.004) 2.080 (0.000) 0.320
15:54:41.408 (0.091) 2.116 (0.036) 0.375
15:54:41.412 (0.004) 2.120 (0.004) 0.375
15:54:41.432 (0.020) 2.140 (0.020) 0.375
15:54:41.436 (0.004) 2.144 (0.004) 0.375
15:54:41.439 (0.003) 2.147 (0.003) 0.375
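
A note for reading the log: the last column is consistent with a running sum of (system-clock delta minus stream-time delta). For example, 0.083 - 0.061 gives 0.022, and adding 0.156 - 0.000 gives 0.178. The sketch below recomputes that column under this assumption; `LogRow` and `AccumulateDrift` are hypothetical names used only for illustration and are not part of the original logging code.

```cpp
#include <vector>

// One log row: how far each clock advanced since the previous row, in seconds.
struct LogRow {
    double sysDelta;     // system clock delta
    double streamDelta;  // stream time (reference clock) delta
};

// Returns the accumulated drift after each row, assuming drift is the
// running sum of (sysDelta - streamDelta).
std::vector<double> AccumulateDrift(const std::vector<LogRow>& rows) {
    std::vector<double> drift;
    drift.reserve(rows.size());
    double total = 0.0;
    for (const LogRow& row : rows) {
        total += row.sysDelta - row.streamDelta;
        drift.push_back(total);
    }
    return drift;
}
```

Read this way, the stall between 15:54:41.019 and 15:54:41.408 is a period in which the stream time stays at 2.080 while the drift grows from 0.022 to 0.375 seconds, after which both clocks advance together again.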
Comments (2)
When a sound card is in the graph, it is usually selected as the reference clock. Other filters, including the video renderer, use that clock to determine when to show their samples. Using the system clock in parallel is not a good idea; you should use the same reference clock everywhere to stay in sync.
If you know the real length of your audio samples, and you're sure you don't lose any of them (for example, you use TCP, not UDP), then just assigning sequential 120 ms time intervals is a good solution. Taking timestamps from the system clock when a sample arrives from the network is a bad idea because it introduces random time shifts caused by network behavior: you never really know how long it will take for a network packet to arrive.
If you have two filters and want to see how their timing differs, you can install GraphEditPlus, insert a sample grabber before or after your filters, right-click it, and select "watch grabbed samples". It will show all the timestamps and other information. You can also right-click the graph window and choose "see event log", which can help as well.
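
A minimal sketch of the sequential-timestamp approach, assuming mono audio (one byte per sample at 8 kHz) and no lost packets. DirectShow expresses sample times as `REFERENCE_TIME`, a 64-bit count of 100 ns units; `RefTime` below is a portable stand-in for it, and `SampleTimes`, `PacketTimes`, and `DurationFromBytes` are hypothetical helpers, not part of any DirectShow API.

```cpp
#include <cstdint>

// Stand-in for DirectShow's REFERENCE_TIME (64-bit count of 100 ns units).
using RefTime = std::int64_t;

constexpr RefTime kUnitsPerSecond = 10000000;                      // 1 s = 10^7 units
constexpr RefTime kPacketDuration = 120 * (kUnitsPerSecond / 1000); // 120 ms

struct SampleTimes {
    RefTime start;
    RefTime stop;
};

// Stream-relative start/stop times for the packet with the given
// zero-based index, assuming every packet is exactly 120 ms long.
constexpr SampleTimes PacketTimes(std::int64_t packetIndex) {
    return { packetIndex * kPacketDuration,
             (packetIndex + 1) * kPacketDuration };
}

// Duration of a payload derived from its size instead of a fixed constant.
// Assumes 8 kHz, 8-bit, mono u-Law: 8000 bytes per second.
constexpr RefTime DurationFromBytes(std::int64_t byteCount) {
    return byteCount * kUnitsPerSecond / 8000;
}
```

In a real source filter the two values would be passed to `IMediaSample::SetTime` on each outgoing audio sample. Deriving times from a packet counter (or from the running byte count) keeps the audio timeline continuous regardless of when packets happen to arrive; if packets can be lost, the counter would need to account for the gap, which this sketch does not handle.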
To find out which clock in a graph is being used as the reference clock, and to see how that clock drifts relative to the local CPU clock (as measured by QueryPerformanceCounter), check out the DirectShow filter ShowClk.ax.