FFT for Pitch Detection
I've recently been using an FFT for pitch detection, and I've noticed that although the note names are correct (e.g. C, D#), a lot of notes land in the wrong octave (e.g. E2 is categorized as E3, C3 is categorized as C4), always an octave up.
Why is this the case? My algorithm computes the FFT bins, takes the bin with the greatest intensity, and calculates which frequency it corresponds to.
Any help on this? Thanks!
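A minimal sketch of the approach described above, to make the failure concrete. The sample rate, FFT size, and magnitude values are made-up assumptions for illustration, not data from a real recording. The peak bin is converted to a frequency with `bin * sample_rate / fft_size` and then to a note name via the MIDI formula `69 + 12 * log2(f / 440)`.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def bin_to_frequency(bin_index, sample_rate, fft_size):
    """Centre frequency in Hz of an FFT bin."""
    return bin_index * sample_rate / fft_size

def frequency_to_note(freq_hz):
    """Nearest equal-tempered note name with octave, using A4 = 440 Hz."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

def detect_note(magnitudes, sample_rate, fft_size):
    """Naive detector: pick the single strongest bin and name its frequency."""
    peak_bin = max(range(len(magnitudes)), key=lambda i: magnitudes[i])
    return frequency_to_note(bin_to_frequency(peak_bin, sample_rate, fft_size))

# Hypothetical spectrum of an E2 note (about 82.4 Hz) whose second harmonic
# (about 164.8 Hz) is louder than its fundamental.
SAMPLE_RATE = 44100
FFT_SIZE = 16384
magnitudes = [0.0] * (FFT_SIZE // 2)
magnitudes[31] = 0.7   # bin nearest the fundamental
magnitudes[61] = 1.0   # bin nearest the second harmonic

detected = detect_note(magnitudes, SAMPLE_RATE, FFT_SIZE)
print(detected)
```

With these assumed values the strongest bin is the one near the second harmonic, so the note name is right but the octave is one too high.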
Comments (4)
Two thoughts:
If your input and your algorithm are always exactly one octave apart from what you expect, can't you just accept that you're calibrated like that and always subtract an octave?
When you pluck a guitar string you always get a harmonic (the 2nd harmonic) exactly one octave higher that is very loud, about as loud as the natural note (the 1st harmonic). Next you get a harmonic one octave and seven semitones above (the 3rd harmonic), but the octave harmonic is really noticeable.
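The intervals quoted above follow from the fact that the n-th harmonic sits at n times the fundamental frequency, which is `12 * log2(n)` equal-tempered semitones above it. A small sketch to check the numbers (the function name is just illustrative):

```python
import math

def harmonic_interval_semitones(n):
    """Interval between the fundamental and its n-th harmonic, in semitones."""
    return 12 * math.log2(n)

for n in (1, 2, 3, 4):
    semitones = harmonic_interval_semitones(n)
    octaves, remainder = divmod(semitones, 12)
    print(f"harmonic {n}: {int(octaves)} octave(s) + {remainder:.2f} semitones")
```

The 2nd harmonic comes out at exactly one octave, and the 3rd at one octave plus roughly seven semitones, so a detector that locks onto the 2nd harmonic reports the right note name one octave too high.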
Sounds like harmonics to me. Greg's pointed question seems to be on the right track.
If that is true, you could try finding the statistical median of all buckets and find the closest, rather than finding the statistical mode (as you are currently doing).
If you are seeing variation in your output, you could also do temporal smoothing (average over time).
I know that guitar tuners do several of these things, and still come up intermittently wrong. It's a messy business :)
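As a rough illustration of the temporal smoothing idea, here is a sketch that runs a sliding median over per-frame pitch estimates. Note that this swaps the plain average mentioned above for a median: averaging a frame that jumped an octave would give a frequency between the two octaves, while a median simply discards an isolated outlier. The window length and the frame values are assumptions chosen for the example.

```python
import statistics

def smooth_pitch(estimates_hz, window=3):
    """Sliding median over the most recent `window` per-frame estimates."""
    smoothed = []
    for i in range(len(estimates_hz)):
        start = max(0, i - window + 1)
        smoothed.append(statistics.median(estimates_hz[start:i + 1]))
    return smoothed

# Hypothetical per-frame estimates for a steady A2 (110 Hz) with a single
# frame that jumped to the octave above.
frames = [110.0, 110.0, 220.0, 110.0, 110.0]
smoothed = smooth_pitch(frames, window=3)
print(smoothed)
```

This only helps when octave errors are intermittent. If every frame is an octave high, as in the question, smoothing will faithfully reproduce the wrong octave.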
Speaking of live sampling, depending on your sample source, there are a lot of anomalies to consider that could be giving you unexpected results:
These will show up in your data, but you likely won't be able to hear them. And if you're trying to match against multiple tones or chords, your job will be even more complicated.
In deciding which octave to place a pitch in, try adding to each bucket some fraction of the amount of audio that is present at 3x the frequency (e.g. add to the 440Hz bucket a fraction of the amplitude of the 1320Hz bucket). On most instruments, an A440 is likely to have significant components at 880Hz, 1320Hz, 1760Hz, 2200Hz, 2640Hz, etc. An A880 would likely have 880Hz, 1760Hz, and 2640Hz, but would not have a significant 1320Hz component (nor 2200Hz for that matter). So if your code is trying to decide whether a note is an A440 or an A880, looking at the third-harmonic bucket (or other odd harmonics) may provide a useful clue.
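One way to read the suggestion above, sketched under stated assumptions: the spectra are toy magnitude lists with one bin every 440 Hz, and the weight of 0.5 is an arbitrary choice that would need tuning on real data.

```python
def boost_with_third_harmonic(magnitudes, weight=0.5):
    """Add a fraction of the magnitude found at 3x each bin's frequency."""
    boosted = []
    for i, mag in enumerate(magnitudes):
        third = magnitudes[3 * i] if 3 * i < len(magnitudes) else 0.0
        boosted.append(mag + weight * third)
    return boosted

def strongest_bin(magnitudes):
    return max(range(len(magnitudes)), key=lambda i: magnitudes[i])

BIN_HZ = 440.0  # assumed bin spacing, so bin k is centred on k * 440 Hz

# Hypothetical A440 whose 880 Hz harmonic is louder than its fundamental.
# Bins:     0    440  880  1320 1760 2200 2640
a440 = [0.0, 0.8, 1.0, 0.6, 0.3, 0.2, 0.1]
# Hypothetical A880: nothing at 440, 1320, or 2200 Hz.
a880 = [0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.3]

plain_a440 = strongest_bin(a440) * BIN_HZ
boosted_a440 = strongest_bin(boost_with_third_harmonic(a440)) * BIN_HZ
boosted_a880 = strongest_bin(boost_with_third_harmonic(a880)) * BIN_HZ
print(plain_a440, boosted_a440, boosted_a880)
```

In this toy case the plain peak pick reports 880 Hz for the A440 spectrum, while the boosted spectrum favours 440 Hz because of the energy at 1320 Hz. The A880 spectrum has nothing at 1320 Hz, so its 440 Hz bin gets no boost and 880 Hz still wins.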
Octave Detection can be very tricky, especially on a polyphonic signal where the fundamental harmonic and/or other harmonics are missing. Assuming that you are correctly detecting 'pitch' and not just 'harmonics' (see Wikipedia link below), then you could use an Octave Detection algorithm that I developed.
To do pitch detection for PitchScope Player, I decided on a 2-stage algorithm that works like this: a) first the ScalePitch of a note is detected; 'ScalePitch' has 12 possible pitch values: { E, F, F#, G, G#, A, A#, B, C, C#, D, D# }. b) After the ScalePitch and time-width of a note are determined, the octave (fundamental) of that note is calculated by examining ALL the harmonics of 4 possible Octave-Candidate notes.
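To illustrate the second stage in general terms, here is a simple harmonic-sum sketch. It is not the PitchScope Player implementation: the scoring rule (sum the magnitudes at the first few harmonics of each candidate), the bin spacing, and the spectrum values are all assumptions made for this example. Given the frequency of the lowest octave candidate for an already-detected pitch class, it scores four octave candidates and keeps the best one.

```python
def harmonic_score(magnitudes, bin_hz, f0, num_harmonics=4):
    """Sum the magnitudes at the first `num_harmonics` harmonics of f0."""
    total = 0.0
    for k in range(1, num_harmonics + 1):
        index = round(k * f0 / bin_hz)
        if index < len(magnitudes):
            total += magnitudes[index]
    return total

def best_octave_candidate(magnitudes, bin_hz, lowest_candidate_hz, num_octaves=4):
    """Return the candidate fundamental whose harmonics carry the most energy."""
    candidates = [lowest_candidate_hz * 2 ** i for i in range(num_octaves)]
    return max(candidates, key=lambda f0: harmonic_score(magnitudes, bin_hz, f0))

# Hypothetical spectrum of an A2 (110 Hz), one bin every 55 Hz, where the
# second harmonic at 220 Hz is the loudest component.
BIN_HZ = 55.0
spectrum = [0.0] * 64
for freq, mag in [(110, 0.5), (220, 1.0), (330, 0.6), (440, 0.4), (550, 0.2), (660, 0.1)]:
    spectrum[round(freq / BIN_HZ)] = mag

# Pitch class A is assumed to be known already; candidates are 55, 110, 220, 440 Hz.
best = best_octave_candidate(spectrum, BIN_HZ, 55.0)
print(best)
```

Every candidate is scored over the same number of harmonics, so a candidate that is too low is penalised for its empty odd harmonics and one that is too high misses the energy below it. Real signals with missing or weak harmonics would need a more careful weighting than this.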
The complete C++ source code and executable for my pitch detection application, PitchScope Player, is on GitHub (link below), and you could compile and step through it to see how my Octave Detection Algorithm works.
You would want to focus on the function FundCandidCalcer::Calc_Best_Octave_Candidate() in the file FundCandidCalcer.cpp to see that algorithm in C++. The diagram below also gives a rough idea of how to calculate the octave.
https://en.wikipedia.org/wiki/Transcription_(music)#Pitch_detection
https://github.com/CreativeDetectors/PitchScope_Player
The diagram below demonstrates the Octave Detection algorithm which I developed to pick the correct Octave-Candidate note (that is, the correct Fundamental), once the ScalePitch for that note has been determined.