What are some good libraries for extracting data from audio files?
Recently I started using the Shazam app on my iPhone. For those who don't know, this app identifies songs by listening to a small segment of the song that is playing. I was amazed by its accuracy and speed, so I decided to do a little digging.
I found a paper written by one of their developers here. In the paper, the developer goes into a good amount of detail describing the fingerprinting algorithm used in Shazam.
As a pet project, I'd like to make my own song fingerprinting application so I can get some experience with audio programming.
What are some audio libraries that help you extract things like frequency, amplitude, and other characteristics of an audio clip or mp3 song over its duration?
I'm using .NET, but I'm open to libraries in other languages. I'm also fine with both open source and paid libraries. As long as I can reliably extract audio characteristics programmatically, I'll be happy.
See also:
How Shazam Works
Shazam Journal Paper
Comments (3)
Try having a look at NAudio. It may not have all the audio analysis you're looking for up front, but it is quite extensible and would be a good place to start if you're using .NET languages.
To get started with audio features, you should first read this paper.
Many labs have developed their own libraries to extract audio features.
You can have a look at Yaafe, aubio, jAudio, ...
The ffmpeg library supports a lot of audio codecs, but it's quite a pain to interface with, IMHO.
For extracting audio properties, you should consider a decent library suited to signal analysis. In particular, you will need the Fast Fourier Transform (FFT) to extract frequency data from your audio samples. A search gives a lot of results on that topic.
/edit: For .NET, I am confident there is an ffmpeg interface. You will find signal analysis tools for .NET, too.
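To make the FFT suggestion a bit more concrete, here is a rough sketch in Python (the asker mentioned being open to other languages) of pulling a dominant frequency and an RMS amplitude out of each frame of a signal. It uses only the standard library and a synthetic sine wave, so it assumes the audio has already been decoded to raw samples by something like NAudio or ffmpeg. The names `fft`, `analyze_frame` and `analyze`, the frame size, and the sample rate are all made up for illustration, and this is not the Shazam fingerprinting algorithm, just the basic frequency-extraction step.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


def analyze_frame(frame, sample_rate):
    """Return (dominant frequency in Hz, RMS amplitude) for one frame."""
    n = len(frame)
    # A Hann window reduces spectral leakage at the frame edges.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    spectrum = fft(windowed)
    # Bins 1 .. n/2 - 1 cover the positive frequencies below Nyquist (bin 0 is DC).
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = 1 + max(range(len(magnitudes)), key=magnitudes.__getitem__)
    rms = math.sqrt(sum(s * s for s in frame) / n)
    return peak_bin * sample_rate / n, rms


def analyze(samples, sample_rate, frame_size=1024):
    """Split samples into non-overlapping frames and analyze each one."""
    return [
        analyze_frame(samples[i : i + frame_size], sample_rate)
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


# Synthetic stand-in for decoded audio: 1 s of 440 Hz, then 1 s of a quieter 880 Hz.
SAMPLE_RATE = 8192  # chosen so both tones fall exactly on FFT bins
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
signal += [0.5 * math.sin(2 * math.pi * 880 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
features = analyze(signal, SAMPLE_RATE)

for index, (freq, rms) in enumerate(features):
    print(f"frame {index:2d}: {freq:7.1f} Hz, RMS {rms:.3f}")
```

In practice you would probably swap the hand-rolled `fft` for a tested signal-processing library and use overlapping frames, but the overall shape (decode, frame, window, transform, pick peaks) stays the same.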