
VB FFT - trouble understanding how the result relates to frequency

Posted on 2024-07-06 22:23:17

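I'm trying to understand an FFT (fast Fourier transform) routine that I'm using (stealing) (recycling).

The input is an array of 512 data points that make up a sample waveform. Test data is generated into this array, and the FFT transforms the array into the frequency domain. I'm trying to understand the relationship between frequency, period, sample rate, and position in the FFT array. I'll illustrate with examples:

============================================

Sample rate is 1000 samples/second. Generate a set of samples at 10 Hz.

The input array has peak values at arr(28), arr(128), arr(228) ... period = 100 sample points

The peak value in the FFT array is at index 6 (excluding a huge value at 0)

============================================

Sample rate is 8000 samples/second. Generate a set of samples at 440 Hz.

Input array peak values include arr(7), arr(25), arr(43), arr(61) ... period = 18 sample points

The peak value in the FFT array is at index 29 (excluding a huge value at 0)

============================================

How do I relate the index of the peak in the FFT array to frequency?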


一指流沙 2024-07-13 22:23:17

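If you ignore the imaginary part, the frequency distribution is linear across bins:

Frequency@i = (sampling rate / 2) * (i / Nbins)

So for your first example, assuming you had 256 bins, the largest bin corresponds to a frequency of 1000/2 * 6/256 = 11.7 Hz.
Since your input was 10 Hz, I'd guess that bin 5 (9.8 Hz) also had a big component.
To get better accuracy, you need to take more samples, which gives you narrower bins.

Your second example gives 8000/2 * 29/256 = 453 Hz. Again, close, but you need more bins.
Your resolution here is only 4000/256 = 15.6 Hz.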
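A minimal sketch of the formula above, written in terms of the 512 input samples from the question rather than 256 bins (the two forms agree, since 2 * 256 = 512). The function names `bin_frequency` and `nearest_bin` are made up for this illustration, and zero-based bin indices are an assumption.

```python
def bin_frequency(i, sample_rate, n_samples):
    """Centre frequency in Hz of zero-based FFT bin i for an n_samples-point transform."""
    return i * sample_rate / n_samples


def nearest_bin(freq_hz, sample_rate, n_samples):
    """Zero-based bin whose centre frequency lies closest to freq_hz."""
    return round(freq_hz * n_samples / sample_rate)


N = 512  # number of input samples, taken from the question

print(bin_frequency(6, 1000, N))    # first example, peak reported at index 6
print(bin_frequency(29, 8000, N))   # second example, peak reported at index 29
print(nearest_bin(10, 1000, N))     # bin closest to 10 Hz
print(nearest_bin(440, 8000, N))    # bin closest to 440 Hz
print(8000 / N)                     # bin spacing in Hz for the second example
```

The nearest bins to 10 Hz and 440 Hz come out as 5 and 28, one below the indices reported in the question. That would be consistent with a one-based output array, but the question does not say how the routine indexes its output.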


浅忆 2024-07-13 22:23:17

It would be helpful if you were to provide your sample dataset.

My guess would be that you have what are called sampling artifacts. The strong signal at DC (frequency 0) suggests that this is the case.

You should always ensure that the average value of your input data is zero: finding the average and subtracting it from each sample point before invoking the FFT is good practice.

Along the same lines, you have to be careful about sampling-window artifacts. It is important that the first and last data points are close to zero, because otherwise the "step" from outside to inside the sampling window has the effect of injecting a whole lot of energy at different frequencies.

The bottom line is that doing an FFT analysis requires more care than simply recycling an FFT routine found somewhere.

Here are the first 100 sample points of a 10 Hz signal as described in the question, massaged to avoid sampling artifacts:

> sinx[1:100]
  [1]  0.000000e+00  6.279052e-02  1.253332e-01  1.873813e-01  2.486899e-01  3.090170e-01  3.681246e-01  4.257793e-01  4.817537e-01  5.358268e-01
 [11]  5.877853e-01  6.374240e-01  6.845471e-01  7.289686e-01  7.705132e-01  8.090170e-01  8.443279e-01  8.763067e-01  9.048271e-01  9.297765e-01
 [21]  9.510565e-01  9.685832e-01  9.822873e-01  9.921147e-01  9.980267e-01  1.000000e+00  9.980267e-01  9.921147e-01  9.822873e-01  9.685832e-01
 [31]  9.510565e-01  9.297765e-01  9.048271e-01  8.763067e-01  8.443279e-01  8.090170e-01  7.705132e-01  7.289686e-01  6.845471e-01  6.374240e-01
 [41]  5.877853e-01  5.358268e-01  4.817537e-01  4.257793e-01  3.681246e-01  3.090170e-01  2.486899e-01  1.873813e-01  1.253332e-01  6.279052e-02
 [51] -2.542075e-15 -6.279052e-02 -1.253332e-01 -1.873813e-01 -2.486899e-01 -3.090170e-01 -3.681246e-01 -4.257793e-01 -4.817537e-01 -5.358268e-01
 [61] -5.877853e-01 -6.374240e-01 -6.845471e-01 -7.289686e-01 -7.705132e-01 -8.090170e-01 -8.443279e-01 -8.763067e-01 -9.048271e-01 -9.297765e-01
 [71] -9.510565e-01 -9.685832e-01 -9.822873e-01 -9.921147e-01 -9.980267e-01 -1.000000e+00 -9.980267e-01 -9.921147e-01 -9.822873e-01 -9.685832e-01
 [81] -9.510565e-01 -9.297765e-01 -9.048271e-01 -8.763067e-01 -8.443279e-01 -8.090170e-01 -7.705132e-01 -7.289686e-01 -6.845471e-01 -6.374240e-01
 [91] -5.877853e-01 -5.358268e-01 -4.817537e-01 -4.257793e-01 -3.681246e-01 -3.090170e-01 -2.486899e-01 -1.873813e-01 -1.253332e-01 -6.279052e-02

And here are the resulting absolute values of the FFT frequency domain:

 [1] 7.160038e-13 1.008741e-01 2.080408e-01 3.291725e-01 4.753899e-01 6.653660e-01 9.352601e-01 1.368212e+00 2.211653e+00 4.691243e+00 5.001674e+02
[12] 5.293086e+00 2.742218e+00 1.891330e+00 1.462830e+00 1.203175e+00 1.028079e+00 9.014559e-01 8.052577e-01 7.294489e-01
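The two precautions described above (subtract the mean, and make the ends of the buffer go to zero) can be sketched as follows. This is only an illustration under assumptions: the test signal with a DC offset of 1.0 is invented, a Hann window is one common way to taper the ends (the answer does not name a specific window), and a slow direct DFT stands in for whatever FFT routine is actually used.

```python
import cmath
import math


def remove_mean(x):
    """Subtract the average so the DC bin no longer dominates."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def hann_window(x):
    """Taper the buffer so the first and last samples are close to zero."""
    n = len(x)
    return [v * (0.5 - 0.5 * math.cos(2 * math.pi * k / n)) for k, v in enumerate(x)]


def dft_magnitudes(x):
    """Magnitudes of bins 0..n/2 from a direct (slow) DFT; stands in for an FFT."""
    n = len(x)
    return [abs(sum(v * cmath.exp(-2j * math.pi * i * k / n) for k, v in enumerate(x)))
            for i in range(n // 2 + 1)]


def peak_bin(mags):
    """Index of the largest magnitude."""
    return max(range(len(mags)), key=mags.__getitem__)


SAMPLE_RATE, N = 1000, 512
# Hypothetical test signal: a 10 Hz sine riding on a DC offset of 1.0.
signal = [1.0 + math.sin(2 * math.pi * 10 * k / SAMPLE_RATE) for k in range(N)]

raw = dft_magnitudes(signal)
cleaned = dft_magnitudes(hann_window(remove_mean(signal)))
print(peak_bin(raw), peak_bin(cleaned))
```

With the offset left in, the largest value sits in bin 0. After mean removal and windowing, the largest bin is the one nearest 10 Hz.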


緦唸λ蓇 2024-07-13 22:23:17

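I'm a bit rusty on math and signal processing too, but with the additional info I can give it a shot.

If you want to know the signal energy per bin, you need the magnitude of the complex output, so looking at the real output alone is not enough, even when the input is only real numbers. For every bin, the magnitude of the output is sqrt(real^2 + imag^2), just like Pythagoras :-)

If you process one buffer every second (1000 samples at this sample rate), frequencies and array indices line up nicely: bins 0 to 499 are the positive frequencies from 0 Hz up to 500 Hz, and bins 500 to 999 are the negative frequencies, which for a real signal should match the positive ones. So the peak at index 6 would correspond to 6 Hz, which is a bit strange. This might be because you're only looking at the real output data, and the real and imaginary data combine to give the expected peak at index 10. The frequencies should map linearly to the bins.

The peak at 0 indicates a DC offset.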
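To illustrate why the real part alone can mislead, here is a small sketch using a direct DFT of a single bin. The helper `dft_bin` and the 64-sample test signal are assumptions made for the example, not part of the routine in the question.

```python
import cmath
import math


def dft_bin(x, i):
    """Complex DFT value of zero-based bin i (direct sum, no FFT)."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * i * k / n) for k, v in enumerate(x))


n, cycles = 64, 4
# A pure sine with exactly 4 cycles in the buffer.
sine = [math.sin(2 * math.pi * cycles * k / n) for k in range(n)]

X = dft_bin(sine, cycles)
magnitude = math.hypot(X.real, X.imag)  # sqrt(real^2 + imag^2)
print(abs(X.real) < 1e-9, round(magnitude, 6))
```

For this sine, essentially all of the energy in bin 4 lands in the imaginary part, so the real part is close to zero while the magnitude is close to n/2 = 32.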


甲如呢乙后呢 2024-07-13 22:23:17

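It's been some time since I've done FFTs, but here's what I remember.

An FFT usually takes complex numbers as input and output, so I'm not really sure how the real and imaginary parts of the input and output map to your arrays.

I don't really understand what you're doing. In the first example you say you process sample buffers at 10 Hz with a sample rate of 1000 Hz? Then you should have 10 buffers per second with 100 samples each. I don't get how your input array can be at least 228 samples long.

Usually the first half of the output buffer holds the frequency bins from frequency 0 (the DC offset) up to half the sample rate, and the second half holds the negative frequencies. If your input is real data only, with 0 for the imaginary part, the positive and negative frequencies have the same magnitude. The relationship between the real and imaginary parts of the output carries the phase information of your input signal.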
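As a hedged illustration of the last point, the sketch below recovers the phase of a cosine from the real and imaginary parts of one output bin. The helper `dft_bin`, the 64-sample buffer, and the phase value 0.7 rad are all made up for the example.

```python
import cmath
import math


def dft_bin(x, i):
    """Complex DFT value of zero-based bin i (direct sum, no FFT)."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * i * k / n) for k, v in enumerate(x))


n, cycles, phase = 64, 4, 0.7
# Cosine with 4 whole cycles in the buffer and a known starting phase.
x = [math.cos(2 * math.pi * cycles * k / n + phase) for k in range(n)]

X = dft_bin(x, cycles)
recovered = cmath.phase(X)  # same as atan2(X.imag, X.real)
print(round(recovered, 6), round(abs(X), 6))
```

The recovered angle matches the phase used to build the signal, and the magnitude is n/2 regardless of that phase.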


捂风挽笑 2024-07-13 22:23:17

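The frequency for bin i is i * (sample rate / n), where n is the number of samples in the FFT's input window.

If you're dealing with audio, pitch is proportional to the log of frequency, so the pitch resolution of the bins increases with frequency, and it is hard to resolve low-frequency signals accurately. To do so you need a larger FFT window, which reduces time resolution. For a given sample rate there is a trade-off between frequency resolution and time resolution.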
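The trade-off can be made concrete with a rough calculation. The sketch below assumes equal-tempered semitones and uses "bin spacing no wider than the gap to the next semitone" as a crude criterion for resolving a note; both the criterion and the function names are assumptions for illustration.

```python
import math

SEMITONE = 2 ** (1 / 12)  # frequency ratio between adjacent equal-tempered notes


def bin_width(sample_rate, n_samples):
    """Spacing between adjacent FFT bins in Hz."""
    return sample_rate / n_samples


def semitone_gap(freq_hz):
    """Distance in Hz from freq_hz to the next semitone up."""
    return freq_hz * (SEMITONE - 1)


def min_window(sample_rate, freq_hz):
    """Smallest window length whose bin spacing is no wider than one semitone at freq_hz."""
    return math.ceil(sample_rate / semitone_gap(freq_hz))


print(bin_width(8000, 512))                      # bin spacing for the question's second example
print(semitone_gap(440.0), semitone_gap(55.0))   # semitone gap at A4 and at A1
print(min_window(8000, 440.0), min_window(8000, 55.0))
```

At 8000 samples/second, a 512-sample window is fine enough around 440 Hz by this criterion but far too coarse around 55 Hz, where a window several times longer would be needed.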

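You mention a bin with a large value at 0: this is the bin with frequency 0, i.e. the DC component. If it is large, then presumably your values are generally positive. Bin n/2 (256 in your case) is the Nyquist frequency, half the sample rate, which is the highest frequency that can be resolved in a signal sampled at this rate.

If the signal is real, then bins n/2+1 to n-1 contain the complex conjugates of bins n/2-1 to 1, respectively. The DC value appears only once.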
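A short check of the conjugate symmetry and the Nyquist bin described above, using a direct DFT on a small real-valued buffer. The buffer length of 32 and the test signal are arbitrary choices for the sketch.

```python
import cmath
import math


def dft(x):
    """Full complex DFT (direct sum, O(n^2)); stands in for an FFT routine."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * i * k / n) for k, v in enumerate(x))
            for i in range(n)]


SAMPLE_RATE, N = 1000, 32
# Arbitrary real-valued signal with a DC offset.
signal = [0.5 + math.sin(0.3 * k) for k in range(N)]
spectrum = dft(signal)

# Bin n/2 sits at half the sample rate.
nyquist_hz = (N // 2) * SAMPLE_RATE / N

# Bins n/2+1 .. n-1 mirror bins n/2-1 .. 1 as complex conjugates.
mirrored = all(abs(spectrum[N - i] - spectrum[i].conjugate()) < 1e-9
               for i in range(1, N // 2))
print(nyquist_hz, mirrored)
```

Because of this mirroring, only bins 0 through n/2 carry independent information for a real input, which is why plots of real signals usually show just that half.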


我的奇迹 2024-07-13 22:23:17

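As others have said, the samples are equally spaced in the frequency domain (not logarithmically).

For example 1, you should get this:

alt text http://home.comcast.net/~kootsoop/images/SINE1.jpg

For the other example, you should get this:

alt text http://home.comcast.net/~kootsoop/images/SINE2.jpg

So both of your answers appear to be correct regarding the peak location.

What I don't get is the large DC component. Are you sure you are generating a sine wave as the input? Does the input go negative? For a sine wave, the DC component should be close to zero, provided you have enough cycles.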
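One way to test the "does the input go negative?" question is to look at bin 0 directly, since it is just the sum of the samples. The sketch below compares a sine that swings between -1 and 1 with a hypothetical version shifted into the range 0 to 1; the shifted signal is only a guess at what the generated test data might look like.

```python
import math


def dc_bin(x):
    """Magnitude of FFT bin 0, which is simply the sum of the samples."""
    return abs(sum(x))


N, SAMPLE_RATE, FREQ = 512, 1000, 10
bipolar = [math.sin(2 * math.pi * FREQ * k / SAMPLE_RATE) for k in range(N)]
unipolar = [0.5 + 0.5 * v for v in bipolar]  # same sine, shifted so it never goes negative

# Express bin 0 as a fraction of the buffer length for easy comparison.
print(dc_bin(bipolar) / N, dc_bin(unipolar) / N)
```

The bipolar sine leaves only a small residue in bin 0 (the buffer does not hold a whole number of cycles), while the shifted version puts roughly half the buffer length there, which would show up as a huge value at index 0.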


我乃一代侩神 2024-07-13 22:23:17

Another avenue is to set up a Goertzel algorithm for each note center frequency you are looking for.

Once you get one implementation of the algorithm working, you can make it take a parameter that sets its center frequency. With that you could easily run 88 of them, or however many you need, in a collection and scan for the peak value.

The Goertzel algorithm is basically a single-bin FFT. Using this method you can place your bins logarithmically, the way musical notes are naturally spaced.

Some pseudocode from Wikipedia:

s_prev = 0
s_prev2 = 0
coeff = 2*cos(2*PI*normalized_frequency);
for each sample, x[n],
  s = x[n] + coeff*s_prev - s_prev2;
  s_prev2 = s_prev;
  s_prev = s;
end
power = s_prev2*s_prev2 + s_prev*s_prev - coeff*s_prev2*s_prev;

The two variables representing the previous two samples are maintained for the next iteration, so this can be used in a streaming application. I think perhaps the power calculation should be inside the loop as well. (However, it is not depicted that way in the Wikipedia article.)

In the tone detection case there would be 88 different coefficients and 88 pairs of previous samples, resulting in 88 power output values indicating the relative level in each frequency bin.
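A runnable sketch of the pseudocode above, with the power computed once after the loop as in the pseudocode. It assumes `normalized_frequency` means cycles per sample (frequency divided by sample rate), and it assumes standard 88-key equal-tempered tuning with key 49 at 440 Hz; `piano_key_frequency` and the test tone are illustrative additions, not part of the original answer.

```python
import math


def goertzel_power(samples, freq_hz, sample_rate):
    """Signal power near freq_hz, following the pseudocode above."""
    normalized_frequency = freq_hz / sample_rate  # assumed: cycles per sample
    coeff = 2 * math.cos(2 * math.pi * normalized_frequency)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev2 * s_prev2 + s_prev * s_prev - coeff * s_prev2 * s_prev


def piano_key_frequency(key):
    """Frequency of piano key 1..88, assuming equal temperament with key 49 = 440 Hz."""
    return 440.0 * 2 ** ((key - 49) / 12)


SAMPLE_RATE, N = 8000, 512
tone = [math.sin(2 * math.pi * 440.0 * k / SAMPLE_RATE) for k in range(N)]

# One detector per key, skipping keys at or above the Nyquist frequency.
powers = {key: goertzel_power(tone, piano_key_frequency(key), SAMPLE_RATE)
          for key in range(1, 89)
          if piano_key_frequency(key) < SAMPLE_RATE / 2}
best_key = max(powers, key=powers.get)
print(best_key, round(piano_key_frequency(best_key), 2))
```

Each detector is independent, so for streaming use the pair `s_prev`, `s_prev2` would be kept per key between blocks, as the answer describes.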
