Verizon SongID - how is it programmed?
For anyone not familiar with Verizon's SongID program, it is a free application downloadable through Verizon's VCast network. It listens to 10 seconds of a song, starting at any point in the track, and then sends this data to some all-knowing algorithmic beast that chews it up and sends you back all the ID3 tags (artist, album, song, etc.).
The first two parts and last part are straightforward, but what goes on during the processing after the recorded sound is sent?
I figure it must take the sound file (what format?), parse it (how? with what?) for some key identifiers (what are these? regular attributes of wave functions? phase, shift, amplitude, etc.?), and check them against a database.
Everything I find online about how this works is something generic like what I typed above.
From audiotag.info:
This service is based on a sophisticated audio recognition algorithm combining advanced audio fingerprinting technology and a large songs' database. When you upload an audio file, it is being analyzed by an audio engine. During the analysis its audio “fingerprint” is extracted and identified by comparing it to the music database. At the completion of this recognition process, information about songs with their matching probabilities are displayed on screen.
Comments (1)
All of these services work by taking a "fingerprint" from the sampled audio data on the client side, sending it to a server and comparing it against a fingerprint database.
One of the developers of Shazam has written an extremely informative white paper on how the technology works. This should give you all of the information that you need.
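To make the "fingerprint, then compare against a database" idea concrete, here is a minimal sketch in Python. It is not Verizon's or Shazam's actual code. It assumes one common approach: find the strongest frequency in each short frame of audio, hash pairs of nearby peaks as (f1, f2, time gap), and pick the track whose hashes line up with the clip at one consistent time offset. Every name in it (fft, peaks, fingerprint, build_index, identify, synth) is made up for illustration, and a real system would keep many peaks per frame and deal with noise, which this toy does not.

```python
import cmath
import math
from collections import Counter, defaultdict

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def peaks(samples, frame=256, hop=128):
    """Return (frame_index, strongest_frequency_bin) for each frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame - 1))
              for i in range(frame)]
    result = []
    for idx, start in enumerate(range(0, len(samples) - frame + 1, hop)):
        chunk = [s * w for s, w in zip(samples[start:start + frame], window)]
        mags = [abs(c) for c in fft(chunk)[:frame // 2]]
        result.append((idx, max(range(len(mags)), key=mags.__getitem__)))
    return result

def fingerprint(samples, fan_out=3):
    """Hash each peak with the next few peaks: ((f1, f2, dt), anchor_time)."""
    pk = peaks(samples)
    hashes = []
    for i, (t1, f1) in enumerate(pk):
        for t2, f2 in pk[i + 1:i + 1 + fan_out]:
            hashes.append(((f1, f2, t2 - t1), t1))
    return hashes

def build_index(tracks):
    """Map every hash to the (track name, time) pairs where it occurs."""
    index = defaultdict(list)
    for name, samples in tracks.items():
        for h, t in fingerprint(samples):
            index[h].append((name, t))
    return index

def identify(index, clip):
    """Vote on (track, time offset); the most consistent alignment wins."""
    votes = Counter()
    for h, t_clip in fingerprint(clip):
        for name, t_track in index.get(h, ()):
            votes[(name, t_track - t_clip)] += 1
    if not votes:
        return None
    (name, _offset), _count = votes.most_common(1)[0]
    return name

def synth(freqs, rate=8000, tone_len=800):
    """Hypothetical test signal: a sequence of pure sine tones."""
    return [math.sin(2 * math.pi * f * i / rate)
            for f in freqs for i in range(tone_len)]

tracks = {
    "A": synth([500, 1200, 800, 1500, 600, 1000, 1400, 700]),
    "B": synth([900, 400, 1300, 1100, 1600, 550, 1250, 750]),
}
index = build_index(tracks)
print(identify(index, tracks["A"][1600:4800]))  # clip cut from the middle of A
```

Voting on the time offset, rather than just counting shared hashes, is what lets a short clip taken from anywhere in the song match: hashes from the right track agree on where the clip starts, while accidental matches scatter across many offsets.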