How do I get the frequency from FFT results?

Posted on 12-08 15:03

I have recorded an array[1024] of data from the mic on my Android phone and passed it through a 1D forward DFT of the real data (setting a further 1024 elements to 0). I saved the array to a text file, and repeated this 8 times.

I got back 16384 results. I opened the text file in Excel and made a graph to see what it looked like (x = index of array, y = size of number returned). There are some massive spikes (both positive and negative) in magnitude around 110 and 232, and small spikes continuing in that fashion until around 1817 and 1941, where the spikes get big again, then drop again.

My problem is that wherever I look for help on the topic, it mentions getting the real and imaginary numbers. I only have a 1D array, which I got back from the method I used from Piotr Wendykier's class:

DoubleFFT_1D.realForwardFull(audioDataArray); // from the library JTransforms.

My question is: what do I need to do to this data to get a frequency out of it?
The sound recorded was me playing an 'A' on the bottom string (5th fret) of my guitar (at roughly 440 Hz).

Comments (1)

抚笙 2024-12-15 15:03:21

The complex data is interleaved, with real components at even indices and imaginary components at odd indices, i.e. the real components are at index 2*i, the imaginary components are at index 2*i+1.

To get the magnitude of the spectrum at index i, you want:

re = fft[2*i];
im = fft[2*i+1];
magnitude[i] = sqrt(re*re+im*im);
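
As a rough Java sketch of the same step: the class and method names below are illustrative, not part of JTransforms, and the input is assumed to be an interleaved buffer laid out as described above (real part at 2*i, imaginary part at 2*i+1).

```java
public final class SpectrumMagnitude {
    /**
     * Converts an interleaved complex buffer (re at 2*i, im at 2*i+1)
     * into magnitudes for bins 0 .. n/2 - 1, where n = fft.length / 2
     * is the number of complex bins in the buffer.
     */
    public static double[] magnitudes(double[] fft) {
        int n = fft.length / 2;                 // number of complex bins
        double[] magnitude = new double[n / 2]; // keep the lower half only
        for (int i = 0; i < magnitude.length; i++) {
            double re = fft[2 * i];
            double im = fft[2 * i + 1];
            magnitude[i] = Math.hypot(re, im);  // same value as sqrt(re*re + im*im)
        }
        return magnitude;
    }
}
```

For a 2048-element buffer holding 1024 complex bins, this returns 512 magnitudes.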

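Then you can plot magnitude[i] for i = 0 to N / 2 - 1 to get the magnitude spectrum (squaring each value gives the power spectrum). Because the input is real, the upper half of the output mirrors the lower half, so only the first N / 2 bins carry distinct information. Depending on the nature of your audio input you should see one or more peaks in the spectrum.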

To get the approximate frequency of any given peak you can convert the index of the peak as follows:

freq = i * Fs / N;

where:

freq = frequency in Hz
i = index of peak
Fs = sample rate in Hz (e.g. 44100 Hz, or whatever you are using)
N = size of FFT (e.g. 1024 in your case)
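
To make the conversion concrete, here is a small Java sketch with illustrative helper names. It assumes Fs = 44100 Hz, which is only the example value given above; the actual recording rate on the phone may differ. With N = 1024 the bins are Fs / N ≈ 43.07 Hz apart, so a 440 Hz tone falls nearest bin 10, which maps back to about 430.7 Hz.

```java
public final class BinFrequency {
    /** Centre frequency in Hz of bin i, for sample rate fs and FFT size n. */
    public static double binToFrequency(int i, double fs, int n) {
        // fs is a double, so the division is not truncated as it would be
        // if i, fs and n were all ints.
        return i * fs / n;
    }

    /** Index of the bin whose centre is nearest to freq (in Hz). */
    public static int frequencyToBin(double freq, double fs, int n) {
        return (int) Math.round(freq * n / fs);
    }
}
```

The bin spacing is what makes the result "approximate": a peak can only be reported at a multiple of Fs / N unless neighbouring bins are interpolated.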

Note: if you have not previously applied a suitable window function to the time-domain input data then you will get a certain amount of spectral leakage and the power spectrum will look rather "smeared".
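
One common choice of window is the Hann window. The sketch below is an assumption about what "suitable" could mean here, not something prescribed above, and the class name is illustrative. It scales the samples in place before they are passed to the FFT.

```java
public final class HannWindow {
    /** Multiplies data in place by a Hann window of the same length. */
    public static void apply(double[] data) {
        int n = data.length;
        if (n < 2) {
            return; // the window is undefined for fewer than two samples
        }
        for (int i = 0; i < n; i++) {
            // w rises from 0 at the first sample to 1 mid-buffer
            // and falls back to 0 at the last sample.
            double w = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
            data[i] *= w;
        }
    }
}
```

Tapering the ends of the buffer towards zero reduces the discontinuity at the buffer edges, which is what causes the leakage.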


To expand on this further, here is pseudo-code for a complete example where we take audio data and identify the frequency of the largest peak:

N = 1024          // size of FFT and sample window
Fs = 44100        // sample rate = 44.1 kHz
data[N]           // input PCM data buffer
fft[N * 2]        // FFT complex buffer (interleaved real/imag)
magnitude[N / 2]  // power spectrum

// capture audio in data[] buffer
// ...

// apply window function to data[]
// ...

// copy real input data to complex FFT buffer
for i = 0 to N - 1
  fft[2*i] = data[i]
  fft[2*i+1] = 0

// perform in-place complex-to-complex FFT on fft[] buffer
// ...

// calculate power spectrum (magnitude) values from fft[]
for i = 0 to N / 2 - 1
  re = fft[2*i]
  im = fft[2*i+1]
  magnitude[i] = sqrt(re*re+im*im)

// find largest peak in power spectrum
max_magnitude = -INF
max_index = -1
for i = 0 to N / 2 - 1
  if magnitude[i] > max_magnitude
    max_magnitude = magnitude[i]
    max_index = i

// convert index of largest peak to frequency
freq = max_index * Fs / N
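
The pseudo-code above can be sketched as runnable Java. To stay self-contained, this version uses a naive O(N^2) DFT instead of a library FFT; in real use that step would be replaced by an FFT call such as the JTransforms one from the question. The class name, the `sine` test signal, the Hann window and the 44100 Hz sample rate are all assumptions made for illustration.

```java
public final class PeakFrequencyDemo {
    /** Naive O(n^2) DFT of real input; returns interleaved re/im, length 2*n. */
    static double[] dft(double[] data) {
        int n = data.length;
        double[] fft = new double[2 * n];
        for (int k = 0; k < n; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re += data[t] * Math.cos(angle);
                im += data[t] * Math.sin(angle);
            }
            fft[2 * k] = re;
            fft[2 * k + 1] = im;
        }
        return fft;
    }

    /** Frequency in Hz of the largest-magnitude bin among bins 0 .. n/2 - 1. */
    public static double peakFrequency(double[] data, double fs) {
        int n = data.length;
        double[] windowed = new double[n];
        for (int i = 0; i < n; i++) { // apply a Hann window to a copy of the input
            windowed[i] = data[i] * 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
        double[] fft = dft(windowed);
        int maxIndex = -1;
        double maxMagnitude = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < n / 2; i++) {
            double mag = Math.hypot(fft[2 * i], fft[2 * i + 1]);
            if (mag > maxMagnitude) {
                maxMagnitude = mag;
                maxIndex = i;
            }
        }
        return maxIndex * fs / n; // fs is a double, so no integer truncation
    }

    /** Hypothetical test signal: a pure sine at freq Hz sampled at fs Hz. */
    public static double[] sine(double freq, double fs, int n) {
        double[] data = new double[n];
        for (int t = 0; t < n; t++) {
            data[t] = Math.sin(2.0 * Math.PI * freq * t / fs);
        }
        return data;
    }

    public static void main(String[] args) {
        double fs = 44100.0;
        double[] data = sine(440.0, fs, 1024);
        System.out.println(peakFrequency(data, fs));
    }
}
```

For the 440 Hz test tone the peak lands in bin 10, so the reported frequency is roughly 430.7 Hz rather than 440 Hz; a larger N or interpolation between neighbouring bins would narrow that gap.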