Comparing two audio (.wav) files using cross-correlation in Python
I need to compare two audio files to check how similar they are, so I used the cross-correlation method in Python. Here is my code:
from scipy.io import wavfile
from scipy import signal
import numpy as np

sample_rate_a, data_a = wavfile.read('new.wav')
sample_rate_b, data_b = wavfile.read('result.wav')
data_a = np.float32(data_a)
data_b = np.float32(data_b)

corr = signal.correlate(data_a, data_b)
lags = signal.correlation_lags(len(data_a), len(data_b))
corr = corr / np.max(corr)

def Average(l):
    avg = sum(l) / len(l)
    return avg

average = Average(corr)
lag = lags[np.argmax(corr)]

print(corr)
print("Lag =", lag, "np max=", np.max(corr))
print("np.min=", np.min(corr))
print("Average of my_list is", abs(average))
I printed several values, such as the normalized correlation values, the lag, and the maximum, minimum, and average of the normalized values, to get an idea of my output. Here is my output:
[-3.5679664e-09 -1.1893221e-09 2.3786442e-09 ... 1.1893221e-09
-1.1893221e-09 -4.7572883e-09]
Lag = 2886023 np max= 1.0
np.min= -1.8993026
Average of my_list is 6.370856069729521e-05
I am a bit confused by this output because I cannot understand what these values mean. Can anyone help me figure out what they represent? I only need a single percentage value for the similarity of the two audio files.
Thank you
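One way to read the printed values: dividing by np.max(corr) forces the maximum to exactly 1.0, so that number says nothing about similarity. A minimum of about -1.9 means the strongest correlation was negative and larger in magnitude than the positive peak. The lag is the sample offset at which the positive peak occurs, and the average of the correlation sequence is not a similarity measure. A minimal sketch of a bounded score divides by the signal energies instead, so every correlation value falls in [-1, 1]. It assumes both inputs are mono (1-D arrays) recorded at the same sample rate; the function name similarity_percent and the synthetic demo signals are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy import signal


def similarity_percent(a, b):
    """Return (peak normalized cross-correlation as 0-100, lag in samples)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Remove the DC offset so a constant bias does not inflate the score.
    a = a - a.mean()
    b = b - b.mean()
    # Scale by the signal energies; by the Cauchy-Schwarz inequality every
    # value of the scaled correlation then lies in [-1, 1].
    norm = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if norm == 0.0:
        return 0.0, 0  # at least one signal is constant (e.g. silence)
    corr = signal.correlate(a, b, mode="full") / norm
    lags = signal.correlation_lags(len(a), len(b), mode="full")
    # Take the largest magnitude so a polarity-inverted copy still matches.
    peak = int(np.argmax(np.abs(corr)))
    return 100.0 * float(abs(corr[peak])), int(lags[peak])


# Synthetic demo (assumption: noise stands in for real audio samples).
rng = np.random.default_rng(0)
tone = rng.standard_normal(2000)
delayed = np.concatenate([np.zeros(300), tone])  # same signal, 300 samples late
score, lag = similarity_percent(delayed, tone)
print(f"similarity = {score:.1f}%, lag = {lag} samples")
```

With real files, the arrays returned by wavfile.read('new.wav') and wavfile.read('result.wav') would be passed in directly, after reducing stereo data to one channel. This score only measures waveform similarity: two recordings of the same content with different encoding, speed, or noise can still score low.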
Comments (1)
I don't know how to interpret your output, but below is code that returns a number from 0 to 100 for the similarity of two audio files using Python. It works by generating fingerprints from the audio files and comparing them using cross-correlation.
It requires Chromaprint and FFmpeg to be installed. It also doesn't work for short audio files; if that is a problem, you can always reduce the speed of the audio, as in this guide, but be aware that this will add a little noise.
Code converted to Python 3 from: https://shivama205.medium.com/audio-signals-comparison-23e431ed2207
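As a rough illustration of the fingerprint idea described above (this is a sketch, not the code from the linked article), the comparison step might look like the following. It assumes each fingerprint is already available as a list of 32-bit integers, which is the form Chromaprint's fpcalc tool is understood to print with its -raw option. The names bit_agreement and fingerprint_similarity, and the max_offset and min_overlap parameters, are hypothetical. The demo uses random integers in place of real fingerprints.

```python
import random


def bit_agreement(x, y):
    """Fraction of the 32 bits on which two fingerprint words agree (0.0-1.0)."""
    return (32 - bin((x ^ y) & 0xFFFFFFFF).count("1")) / 32


def fingerprint_similarity(fp_a, fp_b, max_offset=50, min_overlap=20):
    """Best mean bit agreement over a range of offsets, as (percent, offset)."""
    best_score, best_offset = 0.0, 0
    for offset in range(-max_offset, max_offset + 1):
        # A positive offset skips the first `offset` words of fp_a,
        # a negative one skips the first `-offset` words of fp_b.
        a = fp_a[offset:] if offset >= 0 else fp_a
        b = fp_b if offset >= 0 else fp_b[-offset:]
        n = min(len(a), len(b))
        if n < min_overlap:
            continue  # too little overlap to give a meaningful score
        score = sum(bit_agreement(x, y) for x, y in zip(a, b)) / n
        if score > best_score:
            best_score, best_offset = score, offset
    return 100.0 * best_score, best_offset


# Demo with made-up fingerprints: the second is the first minus 7 leading words.
random.seed(1)
fp = [random.getrandbits(32) for _ in range(200)]
shifted = fp[7:]
print(fingerprint_similarity(fp, shifted))
```

Unrelated fingerprints agree on roughly half of their bits by chance, so scores near 50 should be read as "no match" rather than "half similar". The min_overlap guard also shows why very short clips are a problem: their fingerprints contain too few words to compare reliably.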