What is a Hamming window for?
I'm working with some code that does a Fourier transform (to calculate the cepstrum of an audio sample). Before it computes the Fourier transform, it applies a Hamming window to the sample:
for (int i = 0; i < SEGMENTATION_LENGTH; i++) {
    // Hamming window coefficient for sample i
    double w = 0.53836 - 0.46164 * Math.cos(TWOPI * (double) i / (double) (SEGMENTATION_LENGTH - 1));
    timeDomain[i] = (float) (w * frameBuffer[i]);
}
Why is it doing this? I can't find any reason for it to do this in the code, or online.
Comments (4)
This is an old question, but I thought the answer could be improved.
Imagine the signal you want to Fourier transform is a pure sine wave. In the frequency domain, you would expect a single sharp spike at the frequency of the sine. However, if you took the Fourier transform of a finite stretch of it, your nice sharp spike would be replaced by something like this:
[figure: spectrum with a main peak flanked by decaying ripples, the shape of a sinc function]
Why is that? A true sine wave extends to infinity in both directions. Computers can't do computations with an infinite number of data points, so every signal is "cut off" at both ends. This truncation causes the ripple you see on either side of the peak. The Hamming window reduces that ripple, giving you a more accurate idea of the original signal's frequency spectrum.
More theory, for the interested: when you cut your signal off at either end, you are implicitly multiplying it by a rectangular ("square") window. The Fourier transform of a rectangular window is the shape in the figure above, known as a sinc function. Whenever you do a Fourier transform on a computer, like it or not, you are always choosing some window. The rectangular window is the implicit default, but it is not a very good choice. People have come up with a variety of windows, each optimizing different characteristics. The Hamming window is one of the standard ones.
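To make the ripple concrete, here is a minimal sketch (not the asker's code) that measures how much of a truncated sine leaks into a bin far from its true frequency, with and without the Hamming taper. The frame length of 64, the 10.5 cycles per frame, the far bin 25, and all class and method names are assumptions chosen for illustration; only the two window constants come from the question.

```java
// Sketch: leakage of a truncated sine, rectangular vs. Hamming window.
// Frame length, test frequency and bin choice are illustrative assumptions.
final class LeakageDemo {

    /** Hamming coefficients, using the same constants as the code in the question. */
    static double[] hamming(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = 0.53836 - 0.46164 * Math.cos(2.0 * Math.PI * i / (n - 1));
        }
        return w;
    }

    /** n samples of a unit sine that completes `cycles` periods over the frame. */
    static double[] sine(int n, double cycles) {
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = Math.sin(2.0 * Math.PI * cycles * i / n);
        }
        return x;
    }

    /** Element-wise product of a window and a frame of the same length. */
    static double[] apply(double[] w, double[] x) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = w[i] * x[i];
        }
        return y;
    }

    /** Magnitude of DFT bin k, computed directly from the definition. */
    static double magnitude(double[] x, int k) {
        double re = 0.0, im = 0.0;
        for (int i = 0; i < x.length; i++) {
            double phase = 2.0 * Math.PI * k * i / x.length;
            re += x[i] * Math.cos(phase);
            im -= x[i] * Math.sin(phase);
        }
        return Math.hypot(re, im);
    }

    /** Level of bin k in dB relative to the strongest bin in 0..n/2. */
    static double relativeDb(double[] x, int k) {
        double peak = 0.0;
        for (int b = 0; b <= x.length / 2; b++) {
            peak = Math.max(peak, magnitude(x, b));
        }
        return 20.0 * Math.log10(magnitude(x, k) / peak);
    }

    public static void main(String[] args) {
        int n = 64;
        double[] raw = sine(n, 10.5);            // does not fit a whole number of periods
        double[] windowed = apply(hamming(n), raw);
        int farBin = 25;                         // well away from the true frequency
        System.out.printf("rectangular: %.1f dB at bin %d%n", relativeDb(raw, farBin), farBin);
        System.out.printf("Hamming:     %.1f dB at bin %d%n", relativeDb(windowed, farBin), farBin);
    }
}
```

The direct DFT loop is slow but keeps the sketch dependency-free; in real code you would window the frame and then hand it to whatever FFT routine you already use.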
Whenever you do a finite Fourier transform, you're implicitly applying it to an infinitely repeating copy of your sample. So, for instance, if the start and end of your finite sample don't match, the repetition has a discontinuity at every seam, and that shows up as lots of high-frequency nonsense in the Fourier transform, which you don't really want. And if your sample happens to be a beautiful sinusoid but a whole number of periods doesn't fit exactly into the finite sample, your FT will show appreciable energy in all sorts of places nowhere near the real frequency. You don't want any of that.
Windowing the data makes sure that the ends match up while keeping everything reasonably smooth; this greatly reduces the sort of "spectral leakage" described in the previous paragraph.
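The "whole number of periods" point is easy to check numerically. The sketch below is an illustration under assumed values (64 samples, 10 versus 10.5 cycles, and hypothetical class and method names): it computes the fraction of spectral energy that lands more than one bin away from the strongest bin when no window is applied.

```java
// Sketch: a sine that fits the frame exactly vs. one that does not (no window applied).
// Frame length and cycle counts are illustrative assumptions.
final class PeriodFitDemo {

    /** n samples of a unit sine that completes `cycles` periods over the frame. */
    static double[] sine(int n, double cycles) {
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = Math.sin(2.0 * Math.PI * cycles * i / n);
        }
        return x;
    }

    /** Energy |X[k]|^2 of DFT bin k, computed directly from the definition. */
    static double binEnergy(double[] x, int k) {
        double re = 0.0, im = 0.0;
        for (int i = 0; i < x.length; i++) {
            double phase = 2.0 * Math.PI * k * i / x.length;
            re += x[i] * Math.cos(phase);
            im -= x[i] * Math.sin(phase);
        }
        return re * re + im * im;
    }

    /** Fraction of the energy in bins 0..n/2 lying more than halfWidth bins from the strongest bin. */
    static double leakedFraction(double[] x, int halfWidth) {
        int half = x.length / 2;
        double[] e = new double[half + 1];
        int peak = 0;
        double total = 0.0;
        for (int k = 0; k <= half; k++) {
            e[k] = binEnergy(x, k);
            total += e[k];
            if (e[k] > e[peak]) {
                peak = k;
            }
        }
        double leaked = 0.0;
        for (int k = 0; k <= half; k++) {
            if (Math.abs(k - peak) > halfWidth) {
                leaked += e[k];
            }
        }
        return leaked / total;
    }

    public static void main(String[] args) {
        int n = 64;
        System.out.printf("10 cycles:   leaked fraction = %.3e%n", leakedFraction(sine(n, 10.0), 1));
        System.out.printf("10.5 cycles: leaked fraction = %.3e%n", leakedFraction(sine(n, 10.5), 1));
    }
}
```

With exactly 10 cycles the repeated frame is a seamless sine, so essentially nothing leaks; with 10.5 cycles the seam mismatch spreads energy across the spectrum.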
From what I know about sound and some quick research, the Hamming window is there to reduce the side lobes of the signal's spectrum (unwanted leakage around each peak), which gives a cleaner picture of the harmonics in the sound.
I also understand that this type of window function works well with the DTFT.
You will find good technical explanations on a Stanford researcher's page, on Wikipedia, and, if you are ready for the maths, in a paper by Harris :D
The FT of a finite-length segment of a sinusoid is the Fourier transform of the window convolved with the sinusoid's frequency peak, since a property of the Fourier transform is that element-wise multiplication in one domain is convolution in the other. The FT of a rectangular window (which is what any unmodified finite run of samples fed to an FFT implies) is the messy-looking sinc function, which splatters any signal that is not exactly periodic in the window over the entire frequency spectrum.
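Written out for a length-N frame, the multiplication/convolution property reads as follows. This is a standard statement of the result in DTFT notation; the symbols x, w, y, and W_R are my own labels, not taken from the code in the question.

```latex
% Windowing a signal: y[n] = w[n] x[n], for n = 0, ..., N-1
\begin{align}
Y(\omega) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} W(\theta)\, X(\omega - \theta)\, d\theta \\
% Rectangular window, w[n] = 1: a periodic sinc (Dirichlet kernel)
W_R(\omega) &= \sum_{n=0}^{N-1} e^{-j\omega n}
             = e^{-j\omega (N-1)/2}\, \frac{\sin(\omega N/2)}{\sin(\omega/2)}
\end{align}
```

For a pure sinusoid, X is a pair of spikes, so Y is simply a copy of the window's spectrum centered on each spike.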
Compared with the sinc function, the FT of a Hamming-shaped window concentrates this "splatter" much nearer to the frequency peak after the convolution. The result is a fatter but smoother frequency peak, with a lot less splatter at frequencies far from it. This gives you not only a cleaner-looking spectrum, but also less interference from far-away frequencies on any signal of interest.
This interpretation (as opposed to the "infinitely repeating" interpretation) makes it clearer why windows shaped differently from the Hamming may give you better results with even less "leakage". In particular, compared with a similar raised-cosine window such as the Hann window, a Hamming window reduces the size of the first side lobe of "leakage" right next to the frequency peak in exchange for more "leakage" (or convolution splatter) far from the frequency of interest. Other windows may be more appropriate if you want a different trade-off. The Harris paper linked in another answer above gives several examples of these different windows.
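As a rough check of that trade-off, the sketch below compares the Hamming window with a Hann window. The Hann window is my choice of comparison, and the frame length of 64, the "near" range of 2.5 to 4.5 bins, the "far" range of 20 to 32 bins, and all names are assumptions. It scans each window's own spectrum at fractional bin offsets and reports the highest side-lobe level in each range.

```java
// Sketch: side-lobe trade-off between a Hamming and a Hann window.
// Frame length and the "near"/"far" ranges are illustrative assumptions.
final class WindowTradeoffDemo {

    /** Raised-cosine window a0 - a1 * cos(2*pi*i/(n-1)); Hamming and Hann differ only in a0, a1. */
    static double[] cosineWindow(int n, double a0, double a1) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = a0 - a1 * Math.cos(2.0 * Math.PI * i / (n - 1));
        }
        return w;
    }

    /** Window spectrum level at a fractional offset of nu DFT bins, in dB relative to its value at 0. */
    static double levelDb(double[] w, double nu) {
        double re = 0.0, im = 0.0, dc = 0.0;
        for (int i = 0; i < w.length; i++) {
            double phase = 2.0 * Math.PI * nu * i / w.length;
            re += w[i] * Math.cos(phase);
            im -= w[i] * Math.sin(phase);
            dc += w[i];
        }
        return 20.0 * Math.log10(Math.hypot(re, im) / dc);
    }

    /** Highest level for offsets in [from, to] bins, scanned on a 0.05-bin grid. */
    static double peakDb(double[] w, double from, double to) {
        int steps = (int) Math.round((to - from) / 0.05);
        double best = Double.NEGATIVE_INFINITY;
        for (int s = 0; s <= steps; s++) {
            best = Math.max(best, levelDb(w, from + s * 0.05));
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 64;
        double[] hamming = cosineWindow(n, 0.53836, 0.46164); // constants from the question
        double[] hann = cosineWindow(n, 0.5, 0.5);
        System.out.printf("near (2.5..4.5 bins): Hamming %.1f dB, Hann %.1f dB%n",
                peakDb(hamming, 2.5, 4.5), peakDb(hann, 2.5, 4.5));
        System.out.printf("far  (20..32 bins):   Hamming %.1f dB, Hann %.1f dB%n",
                peakDb(hamming, 20.0, 32.0), peakDb(hann, 20.0, 32.0));
    }
}
```

Which of the two is "better" depends on whether the interference you care about sits close to the peak or far from it, which is exactly the trade-off described above.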