Time delay estimation between two audio signals

Posted on 2024-10-17 02:43:02

I have two audio recordings of the same signal made by two different microphones (for example, in WAV format), but one of them is delayed relative to the other, for example by several seconds.

The delay is easy to identify visually in a waveform viewer: find the first visible peak in each signal and check that the two peaks have the same shape:


(source: greycat.ru)

But how do I do this programmatically, i.e. find out what this delay (t) is? The two digitized signals differ slightly (different microphones, different positions, different ADC setups, etc.).

I dug around a bit and found that this problem is usually called "time-delay estimation" and that there are many approaches to it - for example, one of them.

But are there any simple, ready-made solutions available, such as a command-line utility, a library, or a straightforward algorithm?

Conclusion: I found no simple implementation, so I wrote a simple command-line utility myself. It is available at https://bitbucket.org/GreyCat/calc-sound-delay (GPLv3-licensed) and implements a very simple search-for-maximum algorithm described on Wikipedia.

Comments (3)

一袭水袖舞倾城 2024-10-24 02:43:02

The technique you're looking for is called cross-correlation. It's a very simple, if somewhat compute-intensive, technique that can be used to solve various problems, including measuring the time difference (aka lag) between two similar signals (the signals do not need to be identical).

If you have a reasonable idea of your lag value (or at least the range of lag values that are expected) then you can reduce the total amount of computation considerably. Ditto if you can put a definite limit on how much accuracy you need.
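As a rough illustration of the bounded search described above, here is a minimal NumPy sketch that evaluates the cross-correlation only for lags in a given window. The function name `estimate_lag`, the `max_lag` parameter, and the synthetic test signal are assumptions made for this example, not part of any existing tool. The cost grows with the number of lags tried, so a narrow window is much cheaper than trying every possible shift.

```python
import numpy as np


def estimate_lag(a, b, max_lag):
    """Return the lag (in samples) in [-max_lag, max_lag] that maximizes
    the cross-correlation sum(a[n + lag] * b[n]).

    A positive result means the common content appears later in `a`
    than in `b`.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove the DC offset so a constant bias does not dominate the score.
    a = a - a.mean()
    b = b - b.mean()

    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Select the overlapping region of a[n + lag] and b[n].
        if lag >= 0:
            x, y = a[lag:], b
        else:
            x, y = a, b[-lag:]
        n = min(len(x), len(y))
        if n == 0:
            continue
        score = np.dot(x[:n], y[:n])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic check: `delayed` is `ref` shifted by 37 samples, rescaled,
# with a little noise added (all values are made up for the example).
rng = np.random.default_rng(0)
ref = rng.standard_normal(2000)
delayed = np.concatenate([np.zeros(37), 0.5 * ref]) + 0.05 * rng.standard_normal(2037)
lag = estimate_lag(delayed, ref, max_lag=100)
```

Dividing the returned lag by the sample rate gives the delay in seconds. If less accuracy is acceptable, both signals could also be downsampled before the search, which shrinks the work further.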

深爱成瘾 2024-10-24 02:43:02

I had the same problem and could not find a tool that automatically syncs the start of video/audio recordings,
so I decided to write syncstart (github).

It is a command-line tool. The basic code behind it is this:

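import numpy as np
from scipy import fft
from scipy.io import wavfile

# in1 and in2 are the paths of the two WAV files (placeholder names here).
in1, in2 = "in1.wav", "in2.wav"
r1, s1 = wavfile.read(in1)
r2, s2 = wavfile.read(in2)
assert r1 == r2, "syncstart normalizes using ffmpeg"
# Assumes mono input; multi-channel files must be reduced to one channel first.
fs = r1
ls1 = len(s1)
ls2 = len(s2)
# Zero-pad both signals to a power of two longer than ls1 + ls2 so that
# the circular correlation computed via the FFT does not wrap around.
padsize = ls1 + ls2 + 1
padsize = 2 ** (int(np.log(padsize) / np.log(2)) + 1)
s1pad = np.zeros(padsize)
s1pad[:ls1] = s1
s2pad = np.zeros(padsize)
s2pad[:ls2] = s2
# Cross-correlation via the FFT: corr[k] = sum over n of s1[n + k] * s2[n].
corr = fft.ifft(fft.fft(s1pad) * np.conj(fft.fft(s2pad)))
ca = np.absolute(corr)
xmax = np.argmax(ca)
# Indices in the upper half of corr represent negative lags.
if xmax > padsize // 2:
    # in2 started recording earlier: cut (padsize - xmax) / fs seconds from it.
    file, offset = in2, (padsize - xmax) / fs
else:
    # in1 started recording earlier: cut xmax / fs seconds from it.
    file, offset = in1, xmax / fs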
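The same FFT-based idea can also be sketched with SciPy's `signal.correlate` and `signal.correlation_lags`, which take care of the padding and of mapping the peak index to a signed lag. This is not how syncstart itself is written; the function name `delay_seconds` and the synthetic 8 kHz signals are assumptions made for the illustration, and `correlation_lags` requires a reasonably recent SciPy.

```python
import numpy as np
from scipy import signal


def delay_seconds(late, early, fs):
    """Estimate by how many seconds `late` lags behind `early`
    (negative if it actually leads). Both are 1-D arrays sampled at `fs` Hz."""
    late = np.asarray(late, dtype=float)
    early = np.asarray(early, dtype=float)
    # method="fft" computes the correlation through FFTs, like the snippet above.
    corr = signal.correlate(late, early, mode="full", method="fft")
    # lags[i] is the shift (in samples) that corr[i] corresponds to.
    lags = signal.correlation_lags(len(late), len(early), mode="full")
    return lags[np.argmax(np.abs(corr))] / fs


# Synthetic check at an assumed 8 kHz sample rate: `late` is `early`
# preceded by 400 samples of silence, i.e. shifted by 0.05 s.
fs = 8000
rng = np.random.default_rng(1)
early = rng.standard_normal(4000)
late = np.concatenate([np.zeros(400), early])
delay = delay_seconds(late, early, fs)
```

A positive result means the first argument should have that many seconds trimmed from its start to line up with the second.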

季末如歌 2024-10-24 02:43:02

A very straightforward thing to do is to check where the peaks exceed some threshold; the time between the high peak on line A and the high peak on line B is probably your delay. Try tinkering a bit with the threshold, and if the graphs are usually as clear as the picture you posted, you should be fine.
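A minimal sketch of this threshold idea, assuming each signal is a 1-D NumPy array. The helper names `first_onset` and `threshold_delay` are made up for the example, and expressing the threshold as a fraction of each signal's own peak is an added assumption so that different microphone gains do not matter.

```python
import numpy as np


def first_onset(x, threshold):
    """Index of the first sample whose absolute value, relative to the
    signal's own peak, exceeds `threshold` (a fraction between 0 and 1)."""
    x = np.abs(np.asarray(x, dtype=float))
    peak = x.max()
    if peak == 0:
        raise ValueError("signal is silent")
    above = np.flatnonzero(x / peak > threshold)
    return int(above[0])


def threshold_delay(a, b, fs, threshold=0.5):
    """Seconds by which the first loud peak in `b` trails the one in `a`."""
    return (first_onset(b, threshold) - first_onset(a, threshold)) / fs


# Synthetic check: a short burst at sample 100 in `a` and at sample 340 in `b`,
# with different gains and a little background noise (values are made up).
fs = 1000
rng = np.random.default_rng(2)
a = 0.01 * rng.standard_normal(1000)
b = 0.01 * rng.standard_normal(1000)
a[100:110] += 1.0
b[340:350] += 0.6
delay = threshold_delay(a, b, fs)
```

This only works when both recordings contain a clear, isolated first peak; with noisy or gradual onsets the detected positions can differ between the two signals, and cross-correlation is the more robust choice.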
