Normalizing FFT data for human hearing

Posted 2024-07-18 11:31:11

A typical audio FFT looks pretty similar to this, with most of the action happening on the far left side:

http://www.flight404.com/blog/images/fft.jpg

He multiplied it by a partial sine wave to get the bottom plot, but the article isn't very specific about this step. It also seems like a "good enough" modification of the dataset rather than one based on some underlying property. I understand that human hearing is better suited to the higher frequencies, so most music has amplified bass and attenuated treble so that both sound roughly equally strong to us.

My question is: what modification needs to be made to the FFT to compensate for this standard falloff?

for (let i = 0; i < fft.length; i++) {
    // Weight bin i by log(i + 1); note that bin 0 is multiplied by log(1) = 0.
    fft[i] = fft[i] * Math.log(i + 1); // works OK, but the high end is
                                       // still not really "loud" enough
}

EDIT:

http://en.wikipedia.org/wiki/Equal-loudness_contour

I came across this article, and I think it might be the direction to head in, but there might still be some property of an FFT that needs to be counteracted.

Comments (4)

○闲身 2024-07-25 11:31:11

First, are you sure you want to do this? It makes sense to compensate for some things, like a microphone response that isn't flat, but not for human perception. People are used to hearing sounds with the spectral content they have in the real world, not shaped along perceptual equal-loudness curves. If you played a sound modified in the way you suggest, it would sound strange. Maybe some people like music with enhanced low frequencies, but that is a matter of taste, not psychophysics.

Or maybe you are compensating for some other reason; for example, taking into account the poorer sensitivity to lower frequencies might improve a compression algorithm. Is this the idea?

If you do want to normalize by the equal-loudness curves, note that most of the curves and equations are expressed in terms of sound pressure level (SPL). SPL is proportional to the log of the square of the waveform amplitude (relative to a reference pressure), so when you work with FFTs it's probably easiest to work with their square (the power spectra). (Or, of course, you could compensate in other ways, say by multiplying by sqrt(log(i+1)) in your equation above, assuming that the log was an approximation of the inverse equal-loudness curve.)
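As a rough illustration of working with power rather than amplitude, here is a minimal JavaScript sketch. The function names (toPowerSpectrum, powerToDb, weightMagnitudes) and the reference power are made up for this example, and it assumes the array already holds real, non-negative bin magnitudes, as in the question's loop.

```javascript
// Sketch only: assumes `mags` is an array of non-negative FFT bin magnitudes.
// Function names and the reference value are illustrative, not from a library.

// Square each magnitude to get the power spectrum.
function toPowerSpectrum(mags) {
  return mags.map((m) => m * m);
}

// Express power in decibels relative to `refPower` (an assumed reference).
// 10 * log10(power) equals 20 * log10(magnitude), which is why curves given
// in dB line up more directly with power than with raw magnitude.
function powerToDb(power, refPower = 1) {
  return power.map((p) => 10 * Math.log10(p / refPower));
}

// Applying a weight w(i) to power is the same as applying sqrt(w(i)) to
// magnitude, which is where the sqrt(log(i + 1)) variant comes from.
function weightMagnitudes(mags, powerWeight) {
  return mags.map((m, i) => m * Math.sqrt(powerWeight(i)));
}

const exampleMags = [1, 10, 0.1];
const exampleDb = powerToDb(toPowerSpectrum(exampleMags));
console.log(exampleDb);
```

With weightMagnitudes(fft, (i) => Math.log(i + 1)) the magnitudes are scaled by sqrt(log(i + 1)), so the corresponding power spectrum is scaled by log(i + 1).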

猥琐帝 2024-07-25 11:31:11

I think the equal loudness contour is exactly the right direction.
However, its shape depends on the absolute pressure level.
In other words the sensitivity curve of our hearing changes with sound pressure.

There is no "correct normalization" if you have no information about absolute levels.
Whether this is a problem depends on what you want to do with the data.

The loudness contours are standardized in ISO 226, but this document is not freely available for download. It should be in a decent university library, though.
Here is another source for loudness contours.
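
If a single fixed curve is acceptable despite the level dependence described above, one common stand-in is A-weighting, which was derived from the 40-phon equal-loudness contour. To be clear, this is a swapped-in approximation and not the ISO 226 data itself. The sketch below is in JavaScript to match the question; the function names and the sampleRate and fftSize parameters are assumptions about your own setup.

```javascript
// Sketch only: A-weighting (as defined in IEC 61672) used as a stand-in for
// an equal-loudness correction. It is NOT the ISO 226 contour, and it ignores
// the playback level. `sampleRate` and `fftSize` are assumed parameters.

// Linear A-weighting gain at frequency f in Hz; close to 1.0 at 1 kHz.
function aWeightingGain(f) {
  const f2 = f * f;
  const c1 = 20.6 * 20.6;
  const c2 = 107.7 * 107.7;
  const c3 = 737.9 * 737.9;
  const c4 = 12194 * 12194;
  const ra = (c4 * f2 * f2) /
    ((f2 + c1) * Math.sqrt((f2 + c2) * (f2 + c3)) * (f2 + c4));
  return ra * Math.pow(10, 2.0 / 20); // +2.00 dB puts 1 kHz at about 0 dB
}

// Scale each magnitude bin by the A-weighting gain at its center frequency.
function applyAWeighting(mags, sampleRate, fftSize) {
  return mags.map((m, i) => m * aWeightingGain((i * sampleRate) / fftSize));
}

const weightedExample = applyAWeighting([1, 1, 1, 1], 44100, 8);
console.log(weightedExample);
```

Because the gain is applied to magnitudes, it would need to be squared before being applied to a power spectrum.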

捶死心动 2024-07-25 11:31:11

So you are trying to raise the level of the high-end frequencies? It sounds like a high-pass filter with a minimum multiplier might work, so that you don't attenuate the low-frequency signals too much. Pick up a good book on filter design, and maybe monkey around with this applet.
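
One way to read "a high-pass filter with a minimum multiplier" is as a gain curve applied directly to the FFT bins: a first-order high-pass magnitude response that is never allowed to drop below some floor. The JavaScript sketch below is only that interpretation; highPassWithFloor, cutoffHz, and minGain are hypothetical names and parameters to tune by eye.

```javascript
// Sketch only: a first-order high-pass magnitude curve with a gain floor,
// applied to FFT magnitude bins. `cutoffHz` and `minGain` are illustrative
// parameters; `sampleRate` and `fftSize` describe your own FFT.
function highPassWithFloor(mags, sampleRate, fftSize, cutoffHz, minGain) {
  return mags.map((m, i) => {
    const f = (i * sampleRate) / fftSize; // center frequency of bin i
    // First-order high-pass magnitude: 0 at DC, approaching 1 above cutoff.
    const hp = f / Math.sqrt(f * f + cutoffHz * cutoffHz);
    // Blend so the gain never falls below minGain.
    const gain = minGain + (1 - minGain) * hp;
    return m * gain;
  });
}

const shapedExample = highPassWithFloor([1, 1, 1, 1], 44100, 8, 5512.5, 0.25);
console.log(shapedExample);
```

This lowers the low bins relative to the high ones rather than boosting the highs, so the result may need rescaling for display.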

兔小萌 2024-07-25 11:31:11

In the old days of the first samplers (this was before MOTU Boost, people :) ), it wasn't done with an FFT but with simple normalisation (Fairlight or Roland did it first, I think) applied to the original or resulting time-domain signal (if you are doing beat slicing, recycle-style). Can't you do that? Or only take the FFT after you have compensated for it?

It seems like a two-phase procedure; otherwise, I'd personally leave the FFT as is for the task.
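
For reference, simple time-domain normalisation of the kind described here can be sketched in a few lines of JavaScript. This is peak normalisation only, which is an assumption about what was meant; normalizePeak and targetPeak are hypothetical names.

```javascript
// Sketch only: peak normalization of a time-domain buffer, done before any
// FFT is taken. `targetPeak` is an assumed parameter.
function normalizePeak(samples, targetPeak = 1.0) {
  // Find the largest absolute sample value.
  let peak = 0;
  for (const s of samples) {
    peak = Math.max(peak, Math.abs(s));
  }
  if (peak === 0) return samples.slice(); // silence: nothing to scale
  // Scale every sample so the peak lands exactly on targetPeak.
  const scale = targetPeak / peak;
  return samples.map((s) => s * scale);
}

const normalizedExample = normalizePeak([0.1, -0.5, 0.25]);
console.log(normalizedExample);
```

Note that this scales all frequencies equally, so it changes overall level but not the balance between low and high bins.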
