I recorded an array of 1024 samples from the mic on my Android phone and passed it through a 1D forward DFT of real data (setting a further 1024 elements to 0). I saved the array to a text file, and repeated this 8 times.
I got back 16384 results. I opened the text file in Excel and made a graph to see what it looked like (x = index of array, y = size of number returned). There are some massive spikes (both positive and negative) in magnitude around 110 and 232, and small spikes continuing in that fashion until around 1817 and 1941, where the spikes get big again, then drop again.
My problem is that wherever I look for help on the topic it mentions getting the real and imaginary numbers, but I only have a 1D array, which I got back from the method I used from Piotr Wendykier's class:
DoubleFFT_1D.realForwardFull(audioDataArray); // from the library JTransforms.
My question is: What do I need to do to this data to return a frequency?
The sound recorded was me playing an 'A' on the bottom string (5th fret) of my guitar (at roughly 440 Hz).
The complex data is interleaved, with real components at even indices and imaginary components at odd indices, i.e. the real components are at index 2*i and the imaginary components are at index 2*i+1.
To get the magnitude of the spectrum at index i, take re = fft[2*i] and im = fft[2*i+1], then compute magnitude[i] = sqrt(re*re + im*im).
Then you can plot magnitude[i] for i = 0 to N / 2 to get the power spectrum. Depending on the nature of your audio input you should see one or more peaks in the spectrum.
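As a rough illustration of the magnitude step, here is a minimal Java sketch. The class and method names (SpectrumMagnitude, magnitudes) are hypothetical and not part of JTransforms; the only assumption is that the array holds N interleaved complex values as described above.

```java
// Hypothetical helper for illustration; not part of JTransforms.
final class SpectrumMagnitude {

    /**
     * Computes magnitudes for bins 0 .. N/2 of an interleaved complex spectrum.
     * Assumes fft[2*i] is the real part and fft[2*i+1] the imaginary part of
     * bin i, so the array holds N = fft.length / 2 complex values.
     */
    static double[] magnitudes(double[] fft) {
        int n = fft.length / 2;                      // number of complex bins
        double[] magnitude = new double[n / 2 + 1];  // bins above N/2 mirror the lower ones for real input
        for (int i = 0; i <= n / 2; i++) {
            double re = fft[2 * i];
            double im = fft[2 * i + 1];
            magnitude[i] = Math.sqrt(re * re + im * im);
        }
        return magnitude;
    }
}
```

For real input the upper half of the spectrum mirrors the lower half, which is why only bins 0 to N / 2 are kept.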
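To get the approximate frequency of any given peak you can convert the index of the peak as follows: freq = i * Fs / N, where:
freq = frequency in Hz
i = index of the peak
Fs = sample rate in Hz
N = size of the FFT (the number of complex output values, 1024 in this case)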
Note: if you have not previously applied a suitable window function to the time-domain input data then you will get a certain amount of spectral leakage and the power spectrum will look rather "smeared".
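One common choice of window is the Hann window. The sketch below is only an illustration: the class name HannWindow and its method are hypothetical, and the count parameter exists because in the setup from the question only the first 1024 elements of the 2048-element array hold samples.

```java
// Hypothetical helper for illustration; not part of JTransforms.
final class HannWindow {

    /**
     * Multiplies the first count elements of data by a Hann window, in place.
     * Elements from index count onwards (e.g. zero padding) are left untouched.
     */
    static void apply(double[] data, int count) {
        if (count < 2) {
            return; // a window over fewer than 2 points is not meaningful
        }
        for (int i = 0; i < count; i++) {
            double w = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (count - 1)));
            data[i] *= w;
        }
    }
}
```

The window would be applied to the time-domain samples before the transform is called, for example HannWindow.apply(audioDataArray, 1024).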
To expand on this further, here is pseudo-code for a complete example where we take audio data and identify the frequency of the largest peak:
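A minimal Java sketch of such an example follows. To keep it self-contained it uses a naive O(N^2) DFT (dftInterleaved, a hypothetical stand-in) that produces the same interleaved real/imaginary layout described above; in practice you would call the JTransforms FFT instead. The 8000 Hz sample rate, the synthetic 440 Hz test tone, and all class and method names are assumptions made for illustration.

```java
// Illustrative sketch only; names and parameters are assumptions.
final class PeakFrequencyDemo {

    /** Naive DFT of real input; output is interleaved: out[2*k] = real, out[2*k+1] = imaginary. */
    static double[] dftInterleaved(double[] x) {
        int n = x.length;
        double[] out = new double[2 * n];
        for (int k = 0; k < n; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            out[2 * k] = re;
            out[2 * k + 1] = im;
        }
        return out;
    }

    /** Returns the frequency in Hz of the largest spectral peak, ignoring bin 0 (DC). */
    static double peakFrequency(double[] samples, double sampleRate) {
        int n = samples.length;

        // Apply a Hann window to reduce spectral leakage.
        double[] windowed = new double[n];
        for (int i = 0; i < n; i++) {
            double w = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
            windowed[i] = samples[i] * w;
        }

        double[] fft = dftInterleaved(windowed);

        // Find the bin with the largest magnitude in 1 .. N/2.
        int peakIndex = 1;
        double peakMagnitude = -1.0;
        for (int i = 1; i <= n / 2; i++) {
            double re = fft[2 * i];
            double im = fft[2 * i + 1];
            double magnitude = Math.sqrt(re * re + im * im);
            if (magnitude > peakMagnitude) {
                peakMagnitude = magnitude;
                peakIndex = i;
            }
        }

        // Convert the bin index to Hz: freq = i * Fs / N.
        return peakIndex * sampleRate / n;
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0; // assumed sample rate
        int n = 1024;
        double[] samples = new double[n];
        for (int i = 0; i < n; i++) {
            samples[i] = Math.sin(2.0 * Math.PI * 440.0 * i / sampleRate); // synthetic 440 Hz tone
        }
        System.out.println("Peak at approximately " + peakFrequency(samples, sampleRate) + " Hz");
    }
}
```

The result is quantised to the bin spacing Fs / N, so it can only be expected to land within about one bin width of the true frequency. Skipping bin 0 is a design choice here, since microphone data often carries a DC offset that would otherwise dominate the spectrum.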