How suitable is the Dynamic Time Warping algorithm for a query-by-humming system?
I'm trying to develop a query-by-humming system and am looking for an efficient algorithm to compare the frequencies of hummed queries against the frequencies in the database. Dynamic Time Warping (DTW) seems suitable, since it can deal with different speeds (tempo).
- But can it be used for comparison even if the user hums in a different key? In other words, at a different pitch (for example, the original song is in C and the user sings it in E).
- Is there any sample code written in C#? (I found some MATLAB code, but unfortunately I'm not familiar with MATLAB.) Or at least a tutorial that describes DTW in this context?
- If DTW is not suitable, are there any other algorithms that would be suitable for this purpose?
Your suggestions are much appreciated. Thanks in advance :)
Comments (2)
In equal temperament tuning (not that humans are tuned per se, but as a model), the ratio between adjacent notes (half steps) is the 12th root of 2, approximately 1.0595, so that 12 half steps make up one octave and each octave is a doubling of frequency. No matter what key someone hums a tune in, you should be able to determine the intervals they are humming by considering the ratios of the frequencies of the notes.
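To make the ratio idea concrete, here is a minimal Python sketch (Python only for brevity; it should port to C# almost line for line). The function names, the 440 Hz reference, and the example note frequencies are illustrative assumptions, not part of any existing library. It converts frequencies to semitones with 12 * log2(f / ref) and then takes the differences between consecutive notes, which do not change when the tune is transposed.

```python
import math

def hz_to_semitones(freq_hz, ref_hz=440.0):
    """Convert a frequency in Hz to semitones relative to ref_hz (equal temperament).

    ref_hz is an arbitrary reference chosen for this sketch.
    """
    return 12.0 * math.log2(freq_hz / ref_hz)

def intervals(freqs_hz):
    """Semitone steps between consecutive notes, rounded to whole semitones."""
    semis = [hz_to_semitones(f) for f in freqs_hz]
    return [round(b - a) for a, b in zip(semis, semis[1:])]

# The same three-note phrase in C (C4, E4, G4) and transposed up to E (E4, G#4, B4).
in_c = [261.63, 329.63, 392.00]
in_e = [329.63, 415.30, 493.88]

print(intervals(in_c))  # [4, 3]
print(intervals(in_e))  # [4, 3]
```

Real hummed pitch is rarely this clean, so rounding to whole semitones may be too strict in practice; keeping the unrounded differences is a reasonable alternative.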
1. It is better to normalize the pitch before comparing two tunes with DTW; I think this is called pitch shifting in the literature.
2. I am not sure whether there is a C# implementation, but there is C/C++ demo code here: https://github.com/EmilioMolina/QueryBySingingHumming
3. DTW is an effective algorithm for matching two time series; the only problem is its computational cost, and a real-world system has to find a way to lower it: a) find a way to index DTW? b) find a more efficient but less accurate algorithm to narrow the search scope before running DTW?
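The points above can be sketched roughly as follows, in Python for brevity (the loops translate directly to C#). This is only an illustration under some assumptions: pitch sequences are already expressed in semitones, pitch is normalized by subtracting the mean (which only works well when the query covers about the same part of the melody as the reference), and the cost is reduced with a Sakoe-Chiba band, one common way to constrain DTW. The function names and example melodies are made up for this sketch.

```python
import math

def normalize_pitch(semitones):
    """Subtract the mean so a melody hummed in any key is centered on 0."""
    mean = sum(semitones) / len(semitones)
    return [s - mean for s in semitones]

def dtw_distance(a, b, window=None):
    """Classic dynamic-programming DTW with an optional Sakoe-Chiba band.

    window limits how far the warping path may stray from the diagonal,
    which reduces the work per comparison; None means no constraint.
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide, or no path can reach the corner.
    w = max(n, m) if window is None else max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Reference melody in semitones (MIDI-style note numbers, hypothetical data).
reference = [60, 60, 62, 64, 64, 62, 60]
# The same melody hummed 4 semitones higher and at half speed.
query = [64, 64, 64, 64, 66, 66, 68, 68, 68, 68, 66, 66, 64, 64]
# A different melody, for contrast.
other = [60, 67, 65, 72, 60, 55, 57]

raw = dtw_distance(reference, query)
same = dtw_distance(normalize_pitch(reference), normalize_pitch(query))
diff = dtw_distance(normalize_pitch(other), normalize_pitch(query))
print(raw, same, diff)
```

After normalization the transposed, slower query should be essentially at distance zero from the reference, while the unrelated melody stays clearly further away. For a large database, a cheap pre-filter (for example comparing coarse interval contours first) could shortlist candidates before running full DTW on them.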