Normalizing FFT Data (FFTW)
Using FFTW I have been computing the FFT of normalized .wav file data. However, I am a bit confused as to how I should normalize the FFT output. I have been using the method that seemed obvious to me, which is simply to divide by the highest FFT magnitude. However, I have seen division by 1/N and by N/2 recommended (where I assume N = FFT size). How do these work as normalization factors? There doesn't seem to me to be an intuitive relation between these factors and the actual data, so what am I missing?
Huge thanks in advance for any help on this.
2 Answers
Surprisingly, there is no single agreed definition of the FFT and the IFFT, at least as far as scaling is concerned, but for most implementations (including FFTW) you need to scale by 1/N in the forward direction, with no scaling in the reverse direction.

Usually (for performance reasons) you will want to lump this scaling factor in with any other corrections, such as your A/D gain, window gain correction factor, etc., so that you have just one combined scale factor to apply to your FFT output bins. Alternatively, if you are just generating, say, a power spectrum in dB, then you can make the correction a single dB value that you subtract from your power spectrum bins.
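As a rough sketch of the above (not from the original answer): this assumes FFTW's real-to-complex interface, an arbitrary frame size of N = 1024, and a synthetic test tone standing in for the .wav samples. FFTW computes unnormalized transforms, so the 1/N is applied by hand, folded into a single combined scale constant. It also shows where the N/2 factor from the question comes from: with 1/N scaling, a real sinusoid of amplitude A splits between a positive- and a negative-frequency bin and so reads A/2, while dividing by N/2 instead makes the single stored peak read A directly.

```c
/* Minimal sketch (build: gcc fft_norm.c -lfftw3 -lm).
   N = 1024 and the test tone are assumptions for illustration. */
#include <math.h>
#include <stdio.h>
#include <fftw3.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define N 1024  /* FFT size; substitute your own frame length */

int main(void)
{
    double *in = fftw_alloc_real(N);
    fftw_complex *out = fftw_alloc_complex(N / 2 + 1);
    fftw_plan plan = fftw_plan_dft_r2c_1d(N, in, out, FFTW_ESTIMATE);

    /* Test signal: amplitude-0.5 sine exactly on bin 10
       (stands in for your normalized .wav samples). */
    for (int i = 0; i < N; i++)
        in[i] = 0.5 * sin(2.0 * M_PI * 10.0 * i / N);

    fftw_execute(plan);  /* FFTW's forward transform is unnormalized */

    /* Fold 1/N together with any other corrections (A/D gain,
       window gain, ...) into one combined scale factor. */
    const double scale = 1.0 / N;

    for (int k = 0; k <= N / 2; k++) {
        double re = out[k][0] * scale;
        double im = out[k][1] * scale;
        double mag = sqrt(re * re + im * im);
        if (mag > 1e-9)
            printf("bin %d: magnitude %.3f\n", k, mag);
    }
    /* Prints "bin 10: magnitude 0.250": with 1/N scaling the
       amplitude-0.5 tone reads 0.25 (half here, half in the
       negative-frequency image). Scaling by N/2 would read 0.5. */

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    return 0;
}
```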
It's often useful with FFTs to refer to Parseval's Theorem, and to other comparisons that require a meaningful magnitude. Furthermore, the height of any individual peak isn't very useful, and depends, for example, on the window that was used in calculating the FFT, as this can lower and broaden the peak. For these reasons, I'd recommend against normalizing by the largest peak, as you then lose any easy connection to meaningful magnitudes, easy comparison between data sets, and so on.
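To make the Parseval's Theorem connection concrete, here is a similar sketch (same assumptions: FFTW's r2c interface, N = 1024, and a made-up two-tone input). For the unnormalized DFT, Parseval's Theorem says sum |x[n]|^2 = (1/N) * sum |X[k]|^2; since the r2c output stores only bins 0..N/2, the interior bins are counted twice to stand in for the conjugate-symmetric negative frequencies.

```c
/* Parseval check sketch (build: gcc parseval.c -lfftw3 -lm). */
#include <math.h>
#include <stdio.h>
#include <fftw3.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define N 1024

int main(void)
{
    double *in = fftw_alloc_real(N);
    fftw_complex *out = fftw_alloc_complex(N / 2 + 1);
    fftw_plan plan = fftw_plan_dft_r2c_1d(N, in, out, FFTW_ESTIMATE);

    /* Arbitrary two-tone test signal. */
    for (int i = 0; i < N; i++)
        in[i] = sin(2.0 * M_PI * 3.0 * i / N)
              + 0.25 * cos(2.0 * M_PI * 20.0 * i / N);

    fftw_execute(plan);

    /* Time-domain energy: sum |x[n]|^2. */
    double e_time = 0.0;
    for (int i = 0; i < N; i++)
        e_time += in[i] * in[i];

    /* Frequency-domain energy: bins 1..N/2-1 are doubled because the
       r2c output omits their negative-frequency conjugate images;
       DC (k = 0) and Nyquist (k = N/2, N even) appear only once. */
    double e_freq = 0.0;
    for (int k = 0; k <= N / 2; k++) {
        double p = out[k][0] * out[k][0] + out[k][1] * out[k][1];
        e_freq += (k == 0 || k == N / 2) ? p : 2.0 * p;
    }
    e_freq /= N;  /* Parseval for the unnormalized DFT */

    /* Both prints should show the same value (544 for this signal). */
    printf("time-domain energy:      %.6f\n", e_time);
    printf("frequency-domain energy: %.6f\n", e_freq);

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    return 0;
}
```

A check like this is a quick way to confirm that whatever scaling convention you settle on is self-consistent, which is exactly the kind of meaningful comparison that normalizing by the largest peak throws away.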