Comparing simple sounds - what is the closest frequency?
I have a rather interesting problem to solve.
I want to take a very simple sound (one note played on the piano) and try to process it in such a way that I can print out which note is most likely being played.
After some googling I came across the Fast Fourier Transform, but I am not entirely sure how I would use it to analyze data from a WAV file.
Another thought I had was that a note should sound more or less the same each time it is played. If that is the case, could a percentage match on two WAV files turned into byte arrays be of any use?
Thoughts and ideas would be much appreciated.
Comments (3)
The FFT is a much better option than comparing two WAVs. The FFT will produce a frequency spectrum, and since a single piano note consists of a fundamental plus a handful of harmonics, you will observe very distinct spikes when you plot it. The position of each spike denotes one of the constituent frequencies of the waveform, with the largest spike usually representing the note (occasionally a harmonic is stronger than the fundamental, so it is worth checking the lowest strong spike as well).
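To make the "largest spike" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `fft` is a textbook recursive radix-2 transform (a real project would more likely use a DSP library), `synth_note` is a hypothetical stand-in for a recording, and `strongest_frequency` simply reports the bin with the greatest magnitude.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def synth_note(freq_hz, sample_rate, n):
    """Hypothetical test signal: a fundamental plus one weaker harmonic."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate)
        + 0.3 * math.sin(2 * math.pi * 2 * freq_hz * i / sample_rate)
        for i in range(n)
    ]


def strongest_frequency(samples, sample_rate):
    """Return the frequency in Hz of the largest spectral spike, ignoring DC."""
    n = len(samples)
    if n < 2 or n & (n - 1):
        raise ValueError("sample count must be a power of two")
    # A Hann window tapers the ends of the block to reduce spectral leakage.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    spectrum = fft(windowed)
    # Only bins 1 .. n/2 - 1 are needed: the upper half mirrors the lower half.
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = magnitudes.index(max(magnitudes)) + 1
    # Bin k corresponds to k * sample_rate / n Hz.
    return peak_bin * sample_rate / n
```

Calling `strongest_frequency(synth_note(440.0, 8192, 4096), 8192)` gives bins 2 Hz apart, so the answer can only be as precise as that spacing; a longer block gives finer resolution.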
You should be analyzing the frequency of the note being played. I'm a bit rusty, but I think the FFT should do this, as it breaks the waveform down into its frequency spectrum.
You do not want to be comparing the WAV file with an already stored one, as the phase, amplitude, etc. could be different. A 'percentage match' would produce erroneous results.
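A small sketch of why the byte-level comparison breaks down, assuming two hypothetical 8-bit recordings of the same pure tone that differ only in where the recording started (`tone_bytes` and `percent_match` are illustrative names, not part of any library):

```python
import math


def tone_bytes(freq_hz, phase, sample_rate=8000, n=8000):
    """Hypothetical 8-bit signed samples of one pure tone, as a list of ints."""
    return [
        int(round(127 * math.sin(2 * math.pi * freq_hz * i / sample_rate + phase)))
        for i in range(n)
    ]


def percent_match(a, b):
    """Percentage of positions at which two equal-length sample lists agree."""
    if len(a) != len(b):
        raise ValueError("sample lists must have the same length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


if __name__ == "__main__":
    first = tone_bytes(440.0, 0.0)
    shifted = tone_bytes(440.0, math.pi / 2)  # same note, quarter-cycle offset
    print(percent_match(first, first))
    print(percent_match(first, shifted))
```

The two lists describe the same 440 Hz note, yet only a small fraction of their samples coincide, which is why a frequency-domain comparison is the more robust route.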
Once you have the frequency of the waveform, you can then work out which note is being played.
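Going from a frequency to a note name is just arithmetic. The sketch below assumes twelve-tone equal temperament with A4 tuned to 440 Hz and uses MIDI note numbers (A4 = 69) as an intermediate step; `nearest_note` is a hypothetical helper name.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, offset in cents) for the closest equal-tempered note."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Each semitone is a factor of 2 ** (1 / 12); MIDI note 69 is A4.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    name = NOTE_NAMES[midi % 12]
    octave = midi // 12 - 1  # MIDI note 60 is C4 under this convention
    exact_hz = a4_hz * 2 ** ((midi - 69) / 12)
    cents_off = 1200 * math.log2(freq_hz / exact_hz)
    return f"{name}{octave}", cents_off
```

For example, `nearest_note(261.63)` names the note as `C4` (middle C); the second value says how many cents sharp or flat the measured frequency is, which helps judge how trustworthy the match is.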
I would start reading up on Digital Signal Processing (DSP) and spectral analysis. Sounds like you're trying to find the fundamental frequency of your piano note.
To do any meaningful work with a WAV or other file format, you'll need to extract and interpret the audio samples. If you don't want to do so by hand, I'd suggest looking into one of the many existing DSP libraries. I'm not sure what good C# libraries exist.
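For a sense of what "extracting the samples" involves, here is a rough sketch using Python's standard `wave` and `struct` modules rather than C#. It assumes an uncompressed 16-bit PCM file and averages the channels down to mono; `read_mono_samples` is a hypothetical helper, and other sample widths or compressed formats would need more work.

```python
import struct
import wave


def read_mono_samples(path):
    """Return (samples, sample_rate) from a 16-bit PCM WAV file.

    Samples are floats in the range [-1.0, 1.0); channels are averaged to mono.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers,
    # interleaved by channel (L, R, L, R, ... for stereo).
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    samples = [
        sum(values[i : i + channels]) / (channels * 32768.0)
        for i in range(0, len(values), channels)
    ]
    return samples, sample_rate
```

The returned list is what you would hand to an FFT routine, after trimming or padding it to whatever block length that routine expects.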
The Fast Fourier Transform (FFT) converts your signal from the time domain to the frequency domain, showing how much power sits at each frequency, essentially adding a z-axis to your audio.