Fast Fourier transform (FFT) input and output for analysing the frequency of audio files in Java?

Posted on 2024-11-19 10:03:11, 105 words, 2 views, 0 comments

I have to use an FFT to analyse the frequency content of an audio file, but I don't know what the input and output are.

Do I need a 1-dimensional, 2-dimensional, or 3-dimensional array if I want to draw the audio file's spectrum? And can someone suggest an FFT library for J2ME?

Comments (3)

×纯※雪 2024-11-26 10:03:11

@thongcaoloi,

The simple answer regarding the dimensionality of your input data is: you need 1D data. Now I'll explain what that means.

Because you want to analyze audio data, the input to the discrete Fourier transform (the DFT, which the FFT computes efficiently) is a 1-dimensional sequence of real numbers representing the changing voltage of the audio signal over time. Your audio file is a digital representation of that changing voltage.

Your audio file was produced by sampling the voltage of a continuous audio signal at a fixed sampling rate (also known as the sampling frequency), typically 44.1 kHz for CD-quality audio.

But your data file could have been sampled at a much lower frequency, so find out the sampling frequency of your data before you run an FFT on it. You need it to convert the positions in the FFT output into frequencies in Hz.
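To illustrate why the sampling frequency matters, here is a minimal sketch of the usual mapping between an FFT output index and a frequency in Hz (bin k of an N-point FFT sits at k * fs / N). The class and method names are made up for this example and do not come from any library. It is written for standard Java SE; `Math.round` may not be available on a J2ME/CLDC runtime.

```java
// Hypothetical helper; names are illustrative, not from any library.
final class BinFrequency {
    /** Frequency in Hz of output bin k for an FFT of fftSize samples. */
    static double binToHz(int k, int fftSize, double sampleRateHz) {
        return k * sampleRateHz / fftSize;
    }

    /** Nearest bin index for a frequency in Hz (meaningful up to sampleRateHz / 2). */
    static int hzToBin(double hz, int fftSize, double sampleRateHz) {
        return (int) Math.round(hz * fftSize / sampleRateHz);
    }

    public static void main(String[] args) {
        double fs = 44100.0; // assumed CD-quality sampling rate
        int n = 1024;        // assumed FFT size
        System.out.println("Bin spacing in Hz: " + binToHz(1, n, fs));
        System.out.println("Bin nearest 1 kHz: " + hzToBin(1000.0, n, fs));
    }
}
```

With the wrong sampling rate, every frequency label on the plot would be scaled by the same wrong factor, which is why it is worth checking the file header first.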

So now you have to extract the individual samples from your audio file. If your file is stereo, it will have two separate sample sequences, one for the right channel and one for the left channel. If the file is mono, it will have only one sample sequence.

If your file is stereo, or any other multi-channel audio format such as 5.1 or 7.1, you could FFT each channel separately, or you could combine any number of channels by adding their samples together (voltage addition). That's up to you, and depends on what you're trying to do with your FFT results.
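As a rough sketch of the extraction step, the code below turns raw interleaved PCM bytes into a single 1D array suitable as FFT input. It assumes 16-bit signed little-endian samples (the common WAV layout) with the file header already stripped, and it averages the channels rather than plainly summing them so the result stays within [-1, 1). The class and method names are hypothetical.

```java
// Hypothetical helper; assumes 16-bit signed little-endian interleaved PCM.
final class PcmSamples {
    /**
     * Converts interleaved PCM bytes into one mono sequence in [-1, 1)
     * by averaging the channels of each frame.
     */
    static double[] toMono(byte[] pcm, int channels) {
        int bytesPerFrame = 2 * channels;
        int frames = pcm.length / bytesPerFrame;
        double[] mono = new double[frames];
        for (int f = 0; f < frames; f++) {
            double sum = 0.0;
            for (int c = 0; c < channels; c++) {
                int i = f * bytesPerFrame + 2 * c;
                // Low byte is treated as unsigned; the high byte carries the sign.
                int sample = (pcm[i] & 0xFF) | (pcm[i + 1] << 8);
                sum += sample / 32768.0;
            }
            mono[f] = sum / channels;
        }
        return mono;
    }
}
```

To analyse channels separately instead, the same loop could write each channel's sample into its own array rather than accumulating a sum.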

The output of the DFT or FFT is a sequence of complex numbers. Each complex number is a pair consisting of a real part and an imaginary part, typically shown as a pair (re,im).

If you want to graph the spectrum of your audio file in decibels, which is what most people want from the FFT, plot 20*log10( sqrt( re^2 + im^2 ) ) for the first N/2 complex numbers of the FFT output, where N is the number of input samples to the FFT. Strictly speaking this is the magnitude spectrum in dB rather than a properly scaled power spectral density, but the two differ only by a constant scale factor, so the plot has the same shape.
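The dB formula above can be sketched in a few lines. This assumes the FFT output is already available as two parallel arrays `re` and `im` (many libraries use a different layout, so that part is an assumption), and it clamps very small magnitudes so that a silent bin does not produce negative infinity. The names are illustrative only, and `Math.log10` is standard Java SE, which may not exist on a J2ME/CLDC runtime.

```java
// Hypothetical helper; the parallel re/im array layout is an assumption.
final class SpectrumDb {
    /** Returns 20*log10(|X[k]|) for the first N/2 bins of an N-point FFT output. */
    static double[] magnitudeDb(double[] re, double[] im) {
        int half = re.length / 2;
        double[] db = new double[half];
        for (int k = 0; k < half; k++) {
            double mag = Math.sqrt(re[k] * re[k] + im[k] * im[k]);
            // Floor the magnitude so log10 never sees zero.
            db[k] = 20.0 * Math.log10(Math.max(mag, 1e-12));
        }
        return db;
    }
}
```

The returned array is what you would draw: index k on the horizontal axis (or its frequency in Hz) and `db[k]` on the vertical axis.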

You can try to build your own spectrum analyzer software program, but I suggest using something that's already built and tested.

These two FFT spectrum analyzers give results instantly, and have built-in IFFT synthesis, meaning that you can inverse Fourier transform the frequency-domain spectral data to reconstruct the original signal in the time-domain.

http://www.mathworks.com/help/techdoc/ref/fft.html

http://www.sooeet.com/math/fft.php

There's a lot more to this topic, and to the subject of digital signal processing in general, but this brief introduction should get you started.

爱殇璃 2024-11-26 10:03:11

In the theoretical sense, an FFT maps complex[N] => complex[N]. However, if your data is just an audio file, then your input is simply complex numbers with no imaginary component, so you effectively map real[N] => complex[N]. With a little math, you can see that for real input the output always satisfies output[i] == complex_conjugate(output[N-i]) for i = 1..N-1. Thus you really only need to look at the first N/2+1 output elements. Additionally, the complex output of the FFT gives you information about both phase and magnitude. If all you care about is how much of a certain frequency is in your audio, you only need to look at the magnitude, which can be calculated as square_root(imaginary^2 + real^2) for each element of the output.

Of course, you'll need to look at the documentation of whatever library you use to understand which array element holds the real part of the k-th complex output, and likewise which one holds the imaginary part.
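The conjugate symmetry can be checked numerically. The sketch below computes individual DFT bins directly from the definition (slow, but easy to verify by eye) and compares bin k with bin N-k for a real input. It is only a demonstration of the property, not how a real FFT library works internally, and all names are made up for this example.

```java
// Hypothetical demo class; uses the DFT definition directly, not a fast algorithm.
final class ConjugateSymmetry {
    /** Computes one DFT bin of a real signal; returns {re, im}. */
    static double[] dftBin(double[] x, int k) {
        int n = x.length;
        double re = 0.0;
        double im = 0.0;
        for (int t = 0; t < n; t++) {
            double angle = -2.0 * Math.PI * k * t / n;
            re += x[t] * Math.cos(angle);
            im += x[t] * Math.sin(angle);
        }
        return new double[] { re, im };
    }

    /** True if output[k] equals conj(output[N-k]) within tol for k = 1..N-1. */
    static boolean isConjugateSymmetric(double[] x, double tol) {
        int n = x.length;
        for (int k = 1; k < n; k++) {
            double[] a = dftBin(x, k);
            double[] b = dftBin(x, n - k);
            // Conjugate pair: equal real parts, opposite imaginary parts.
            if (Math.abs(a[0] - b[0]) > tol || Math.abs(a[1] + b[1]) > tol) {
                return false;
            }
        }
        return true;
    }
}
```

Because the upper half mirrors the lower half, plotting only bins 0 through N/2 loses no information for real audio input.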

奢华的一滴泪 2024-11-26 10:03:11

As I remember, the FFT algorithm is not that complicated; I once wrote an FFT calculation class for my thesis. The input was a 1D array of values read from *.WAV files. Before the FFT, some filtering and normalization were performed.
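For a sense of how small such a class can be, here is a sketch of an in-place radix-2 Cooley-Tukey FFT together with a simple peak normalization step. This is not the thesis code mentioned above. The normalization method is an assumption, since the post does not say which one was used, and the filtering step is omitted. The class name is hypothetical, the input length must be a power of two, and the code targets standard Java SE (the enhanced for loop would need rewriting for older J2ME compilers).

```java
// Hypothetical sketch of a radix-2 FFT; not taken from any library.
final class SimpleFft {
    /** Scales samples so the largest absolute value becomes 1 (no-op for silence). */
    static void normalizePeak(double[] x) {
        double peak = 0.0;
        for (double v : x) {
            peak = Math.max(peak, Math.abs(v));
        }
        if (peak == 0.0) {
            return;
        }
        for (int i = 0; i < x.length; i++) {
            x[i] /= peak;
        }
    }

    /** In-place FFT of (re, im); both arrays must share a power-of-two length. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 0 || (n & (n - 1)) != 0 || im.length != n) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input into bit-reversed index order.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double tr = re[i]; re[i] = re[j]; re[j] = tr;
                double ti = im[i]; im[i] = im[j]; im[j] = ti;
            }
        }
        // Combine transforms of length len/2 into transforms of length len.
        for (int len = 2; len <= n; len <<= 1) {
            double step = -2.0 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(step * k);
                    double wi = Math.sin(step * k);
                    int a = start + k;
                    int b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr;
                    im[b] = im[a] - ti;
                    re[a] += tr;
                    im[a] += ti;
                }
            }
        }
    }
}
```

For audio, `re` would hold the (normalized) samples and `im` would start as all zeros; after the call, the two arrays hold the real and imaginary parts of the spectrum.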
