Comparing the waveforms of two audio files
I am a VC++ developer and need help comparing two audio files. Let's say I have two wave files, one of which was created from the other with some modifications, such as lowering the loudness.
Now I have to compare these files and check whether the modified one is still almost a copy of the original, that is, that my application has not unintentionally distorted the audio while creating the modified file.
A percentage value would be the best way to express how different the files are.
I have tried taking the FFT of both files, computing the difference in dB (like 10 * log10(ft1/ft2)), and averaging the result. I get a number, but I am not sure what that number signifies.
Thanks in advance for any kind of help.
Comments (1)
The number you obtain doesn't really signify anything aside from how similar the spectra are. There are so many ways of modifying an audio file, and so many ways of comparing two files, that it's impossible to give a general answer. If you know exactly which modifications were made, then you can do a reasonable job. For instance, if you know that the only modification is that the volume has been changed by a constant factor, then the squared magnitude of the FFT, normalised so that its peak is 1.0, will be identical for the original and modified signals. You can calculate the sum of the differences between the two FFT magnitudes, but this is just a number, and you can't convert it to a percentage in any meaningful way (what would it mean to say that two sounds are 30% different?).
So I would step back a bit and work out the problem that you're actually trying to solve.
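As a rough illustration of the constant-gain case described above, here is a minimal C++ sketch. It assumes both files have already been decoded into mono sample buffers of equal length and sample rate. The function names (`powerSpectrum`, `normalisePeak`, `spectralDistance`) are made up for this example, and a naive O(N^2) DFT stands in for a real FFT library so that the code is self-contained.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Squared magnitude of each DFT bin from 0 up to Nyquist.
// Naive O(N^2) DFT: only a stand-in for a proper FFT library.
std::vector<double> powerSpectrum(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<double> p(n / 2 + 1, 0.0);
    for (std::size_t k = 0; k < p.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k) *
                                 static_cast<double>(t) / static_cast<double>(n);
            sum += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        p[k] = std::norm(sum);  // std::norm returns |X[k]|^2
    }
    return p;
}

// Rescale so that the largest bin is 1.0.
// An all-zero spectrum is returned unchanged to avoid dividing by zero.
std::vector<double> normalisePeak(std::vector<double> p) {
    const double peak = p.empty() ? 0.0 : *std::max_element(p.begin(), p.end());
    if (peak > 0.0) {
        for (double& v : p) v /= peak;
    }
    return p;
}

// Sum of absolute differences between the two peak-normalised power spectra.
// Assumption: both signals have the same length and sample rate.
double spectralDistance(const std::vector<double>& a, const std::vector<double>& b) {
    const std::vector<double> pa = normalisePeak(powerSpectrum(a));
    const std::vector<double> pb = normalisePeak(powerSpectrum(b));
    const std::size_t n = std::min(pa.size(), pb.size());
    double d = 0.0;
    for (std::size_t k = 0; k < n; ++k) d += std::abs(pa[k] - pb[k]);
    return d;
}
```

Under these assumptions, `spectralDistance(original, modified)` should come out close to zero when the only change is a constant gain, and grow as the spectra diverge. The result has no natural upper bound or percentage scale, and because only magnitudes are compared, changes that affect phase alone would go unnoticed.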