This field is known as machine listening.
Polyphonic transcription of digitally encoded music is one of the holy grails of machine listening. It is an unsolved problem and an area of active research, spanning several sub-fields.
Depending on the nature of your project, you might find it useful to explore the SuperCollider programming environment. SC is a language designed for projects such as this, already has a large number of machine listening plugins (ugens), and a comprehensive framework for dealing with FFT, audio signals, and much more.
This question about note onset detection contains a lot of information which may be useful to you.
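As a minimal illustration of one technique commonly discussed in that context, here is a rough spectral-flux sketch in Python with NumPy. The function name, frame size, and hop size are arbitrary choices of mine, not taken from that question or from any particular library:

```python
import numpy as np

def onset_strength(samples, sr, frame_size=1024, hop=512):
    """Very rough spectral-flux onset strength curve.

    samples: 1-D float array of mono audio; sr: sample rate in Hz.
    Returns (times, flux); peaks in flux suggest candidate note onsets.
    """
    window = np.hanning(frame_size)
    prev_mag = np.zeros(frame_size // 2 + 1)
    times, flux = [], []
    for start in range(0, len(samples) - frame_size, hop):
        frame = samples[start:start + frame_size] * window
        mag = np.abs(np.fft.rfft(frame))
        # Spectral flux: sum of positive magnitude increases since the last frame.
        diff = mag - prev_mag
        flux.append(np.sum(diff[diff > 0]))
        prev_mag = mag
        times.append(start / sr)
    return np.array(times), np.array(flux)
```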
This sounds like a huge but very interesting project. Good luck!
Music transcription means creating music notation from sound (or audio data). While accomplished musicians, and especially composers, are able to do this, it is an extremely difficult task for a machine, and as far as I know there has been little success so far, mostly academic experiments.
Basically, to recognize notes, you need to know where they start, where they end, and what their pitch is. In principle, the Fourier transform is the most basic way to turn the time domain (audio data) into the frequency domain (pitches). In practice, musical instruments generate lots of harmonics (overtones), and once polyphony (many F0s) is added, it becomes a mess.
You could try feeding sequential 50-millisecond slices of the audio data to an FFT. That way you get the spectrum of each slice; you can then detect the strongest peaks in each slice and infer the rhythm from what happens between successive slices.
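A rough sketch of that slicing idea using NumPy; the 50 ms slice length and the number of peaks kept per slice are arbitrary choices:

```python
import numpy as np

def strongest_peaks(samples, sr, slice_ms=50, n_peaks=3):
    """Split mono audio into ~50 ms slices, FFT each slice, and return
    the frequencies (Hz) of the strongest spectral peaks in each slice."""
    slice_len = int(sr * slice_ms / 1000)
    window = np.hanning(slice_len)
    freqs = np.fft.rfftfreq(slice_len, d=1.0 / sr)
    peaks_per_slice = []
    for start in range(0, len(samples) - slice_len, slice_len):
        spectrum = np.abs(np.fft.rfft(samples[start:start + slice_len] * window))
        # Indices of the n_peaks largest-magnitude bins, ignoring the DC bin.
        top = np.argsort(spectrum[1:])[-n_peaks:] + 1
        peaks_per_slice.append(freqs[top])
    return peaks_per_slice
```

Note that with real instruments the strongest bins are often harmonics rather than fundamentals, which is exactly the mess described above.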
Sorry, I couldn't help much... But just wanted to point out that what you're trying to do is extremely difficult, seriously. Perhaps you should start from something simpler, like detecting one-note sine wave melodies. Good luck!
For detecting the fundamental frequency of the melody in polyphonic music you can try out the MELODIA vamp plug-in (non-commercial use only): http://mtg.upf.edu/technologies/melodia
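For what it's worth, MELODIA can be driven from Python through the Vamp host bindings. The sketch below assumes the plug-in is installed where Vamp hosts look for it and that the `vamp` and `librosa` Python packages are available; the filename is a placeholder:

```python
import vamp
import librosa

# Load the audio as mono at 44.1 kHz (the rate MELODIA expects).
audio, sr = librosa.load("song.wav", sr=44100, mono=True)

# Run the plug-in over the whole signal; "mtg-melodia:melodia" is the plug-in key.
result = vamp.collect(audio, sr, "mtg-melodia:melodia")

# For this plug-in, result['vector'] should be a (hop duration, pitch array) pair;
# non-positive pitch values mark frames where no melody was detected.
hop, pitches = result['vector']
print("hop:", hop, "first pitch estimates (Hz):", pitches[:10])
```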
If you want to implement a melody extraction algorithm yourself, you're going to have to check out the current state of the art in research; a good place to start might be the MIREX melody extraction annual evaluation campaign: http://www.music-ir.org/mirex/wiki/Audio_Melody_Extraction
That, or just google "melody extraction" ;)