Android audio FFT to display fundamental frequency

Posted 2024-12-18 21:38:23

I have been working on an Android project for a while that displays the fundamental frequency of an input signal (to act as a tuner). I have successfully implemented the AudioRecord class and am getting data from it. However, I am having a hard time performing an FFT on this data to get the fundamental frequency of the input signal. I have been looking at the post here, and am using FFT in Java and the Complex class that goes with it.

I have successfully used the FFT function found in FFT in Java, but I am not sure if I am obtaining the correct results. For the magnitude of the FFT (sqrt[re*re + im*im]) I am getting values that start high, around 15000 Hz, and then slowly diminish to about 300 Hz. Doesn't seem right.

Also, as far as the raw data from the mic goes, the data seems fine, except that the first 50 values or so are always the number 3, unless I hit the tuning button again while still in the application and then I only get about 15. Is that normal?

Here is a bit of my code.

First of all, I convert the short data (obtained from the microphone) to a double using the following code which is from the post I have been looking at. This snippet of code I do not completely understand, but I think it works.

//Conversion from short to double
double[] micBufferData = new double[bufferSizeInBytes];//size may need to change
final int bytesPerSample = 2; // As it is 16bit PCM
final double amplification = 1.0; // choose a number as you like
for (int index = 0, floatIndex = 0; index < bufferSizeInBytes - bytesPerSample + 1; index += bytesPerSample, floatIndex++) {
    double sample = 0;
    for (int b = 0; b < bytesPerSample; b++) {
        int v = audioData[index + b];
        if (b < bytesPerSample - 1 || bytesPerSample == 1) {
            v &= 0xFF;
        }
        sample += v << (b * 8);
    }
    double sample32 = amplification * (sample / 32768.0);
    micBufferData[floatIndex] = sample32;
}
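
For comparison, here is a minimal sketch of the same conversion under the assumption that audioData is a short[] filled by AudioRecord.read(short[], int, int), so each element is already one signed 16-bit sample and no byte reassembly is needed. The class and method names (PcmConversion, toDoubles) are made up for illustration; sampleCount is assumed to be the number of shorts the read call returned.

```java
final class PcmConversion {
    /**
     * Scales signed 16-bit PCM samples to doubles in [-1.0, 1.0).
     * Assumes pcm holds one sample per element (mono, 16-bit).
     */
    static double[] toDoubles(short[] pcm, int sampleCount) {
        double[] out = new double[sampleCount]; // one double per sample, not per byte
        for (int i = 0; i < sampleCount; i++) {
            out[i] = pcm[i] / 32768.0;
        }
        return out;
    }
}
```

Note that the output array is sized by the number of samples, not by the buffer size in bytes.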

The code then continues as follows:

//Create Complex array for use in FFT
Complex[] fftTempArray = new Complex[bufferSizeInBytes];
for (int i=0; i<bufferSizeInBytes; i++)
{
    fftTempArray[i] = new Complex(micBufferData[i], 0);
}

//Obtain array of FFT data
final Complex[] fftArray = FFT.fft(fftTempArray);
final Complex[] fftInverse = FFT.ifft(fftTempArray);

//Create an array of magnitude of fftArray
double[] magnitude = new double[fftArray.length];
for (int i=0; i<fftArray.length; i++){
    magnitude[i]= fftArray[i].abs();
}


fft.setTextColor(Color.GREEN);
fft.setText("fftArray is "+ fftArray[500] +" and fftTempArray is "+fftTempArray[500] + " and fftInverse is "+fftInverse[500]+" and audioData is "+audioData[500]+ " and magnitude is "+ magnitude[1] + ", "+magnitude[500]+", "+magnitude[1000]+" Good job!");
for(int i = 2; i < samples; i++){
    fft.append(" " + magnitude[i] + " Hz");
}

That last bit is just to check what values I am getting (and to keep me sane!). In the post referred to above, it talks about needing the sampling frequency and gives this code:

private double ComputeFrequency(int arrayIndex) {
    return ((1.0 * sampleRate) / (1.0 * fftOutWindowSize)) * arrayIndex;
}

How do I implement this code? I don't really understand where fftOutWindowSize and arrayIndex come from.

Any help is greatly appreciated!

Dustin

Comments (3)

巴黎盛开的樱花 2024-12-25 21:38:23

I have recently been working on a project that requires almost the same thing. You probably don't need help anymore, but I'll share my thoughts anyway. Maybe someone will need this in the future.

  1. I'm not sure whether the short-to-double function works; I don't understand that snippet of code either. It was written for byte-to-double conversion.
  2. In the code "double[] micBufferData = new double[bufferSizeInBytes];" I think the size of micBufferData should be "bufferSizeInBytes / 2", since every sample takes two bytes and the size of micBufferData should be the number of samples.
  3. FFT algorithms do require an FFT window size, and it has to be a power of 2. However, many implementations accept an input of arbitrary length and handle the rest themselves; their documentation should state the input requirements. In your case, the size of the Complex array can be the FFT input size. I don't really know the details of the FFT algorithm, but I don't think the inverse transform is needed.
  4. To use the code you gave at the end, you should first find the peak index in the FFT result. I used a double array as input instead of Complex, so in my case it is something like:

    double maxVal = -1;
    int maxIndex = -1;
    for( int j=0; j < mFftSize / 2; ++j ) {
        double v = fftResult[2*j] * fftResult[2*j] + fftResult[2*j+1] * fftResult[2*j+1];
        if( v > maxVal ) {
            maxVal = v;
            maxIndex = j;
        }
    }

    Index 2*j holds the real part and 2*j+1 the imaginary part. maxIndex is the index of the peak magnitude you want (More detail here); use it as the input to the ComputeFrequency function. The return value is the frequency of that peak.
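
Putting the peak search and the frequency formula together, a self-contained sketch could look like the following. It assumes the same interleaved layout as above (real part at 2*j, imaginary part at 2*j+1). The names PeakFinder, peakBin and binToFrequency are hypothetical, and skipping bin 0 (the DC component) is my own addition, not part of the loop above.

```java
final class PeakFinder {
    /**
     * Returns the index of the bin with the largest squared magnitude,
     * searching only the first half of the spectrum (the second half
     * mirrors it for real input). Bin 0 (DC) is skipped as an assumption.
     */
    static int peakBin(double[] interleaved, int fftSize) {
        double maxVal = -1;
        int maxIndex = -1;
        for (int j = 1; j < fftSize / 2; j++) {
            double re = interleaved[2 * j];
            double im = interleaved[2 * j + 1];
            double v = re * re + im * im;
            if (v > maxVal) {
                maxVal = v;
                maxIndex = j;
            }
        }
        return maxIndex;
    }

    /** Same formula as ComputeFrequency: bin index times (sampleRate / fftSize). */
    static double binToFrequency(int bin, double sampleRate, int fftSize) {
        return bin * sampleRate / fftSize;
    }
}
```

Here fftSize plays the role of fftOutWindowSize and the value returned by peakBin plays the role of arrayIndex.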

Hopefully this helps someone.

地狱即天堂 2024-12-25 21:38:23

You should pick an FFT window size depending on your time versus frequency resolution requirements, and not just use the audio buffer size when creating your FFT temp array.
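
To make the trade-off concrete, here is a rough sketch of the arithmetic: a longer window covers more time but gives narrower frequency bins. The helper names (WindowSizing, nextPowerOfTwo, binWidthHz, windowDurationMs) are made up for this example, and rounding up to a power of two is only needed if your FFT implementation requires it.

```java
final class WindowSizing {
    /** Smallest power of two that is >= n (for n >= 1). */
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    /** Spacing between adjacent FFT bins in Hz. */
    static double binWidthHz(double sampleRate, int fftSize) {
        return sampleRate / fftSize;
    }

    /** How much time one window of samples covers, in milliseconds. */
    static double windowDurationMs(double sampleRate, int windowSize) {
        return 1000.0 * windowSize / sampleRate;
    }
}
```

For example, 1024 samples at 44100 Hz cover about 23 ms and give bins about 43 Hz apart, which is far too coarse on its own to tell adjacent low notes apart.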

The array index is your int i, as used in your magnitude[i] print statement.

The fundamental pitch frequency for music is often different from FFT peak magnitude, so you may want to research some pitch estimation algorithms.
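
As one example of such an algorithm, here is a minimal time-domain autocorrelation sketch. This is a different technique from the FFT peak search, it is not tuned for real instruments, and the names (AutocorrelationPitch, estimateHz) and the minHz/maxHz search range are assumptions for illustration only.

```java
final class AutocorrelationPitch {
    /**
     * Estimates pitch by finding the lag (in samples) at which the signal
     * best correlates with a shifted copy of itself, searching only lags
     * that correspond to frequencies between minHz and maxHz.
     * Returns -1 if no lag with positive correlation is found.
     */
    static double estimateHz(double[] x, double sampleRate, double minHz, double maxHz) {
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min((int) Math.ceil(sampleRate / minHz), x.length - 1);
        int bestLag = -1;
        double best = 0;
        for (int lag = minLag; lag <= maxLag; lag++) {
            double sum = 0;
            for (int i = 0; i + lag < x.length; i++) {
                sum += x[i] * x[i + lag];
            }
            if (sum > best) {
                best = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? sampleRate / bestLag : -1;
    }
}
```

The result is quantized to whole-sample lags, so a real tuner would likely need interpolation around the best lag or a more robust method.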

梦中的蝴蝶 2024-12-25 21:38:23

I suspect that the strange results you're getting are because you might need to unpack the FFT. How this is done will depend on the library that you're using (see here for docs on how it's packed in GSL, for example). The packing may mean that the real and imaginary components are not in the positions in the array that you expect.

For your other questions about window size and resolution, if you're creating a tuner then I'd suggest trying a window size of about 20 ms (e.g. 1024 samples at 44.1 kHz). For a tuner you need quite high resolution, so you could try zero-padding by a factor of 8 or 16, which will give you a resolution of 3-6 Hz.
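
A small sketch of what zero-padding means in code, with hypothetical helper names (ZeroPadding, zeroPad, binSpacingHz): the window is copied into a longer array whose tail is all zeros, and the FFT is then run on the longer array. Keep in mind that padding only interpolates the spectrum onto a finer grid; it does not make two close tones easier to separate.

```java
import java.util.Arrays;

final class ZeroPadding {
    /** Returns a copy of window extended with zeros to factor times its length. */
    static double[] zeroPad(double[] window, int factor) {
        // Arrays.copyOf fills the extra elements with 0.0
        return Arrays.copyOf(window, window.length * factor);
    }

    /** Bin spacing in Hz after padding a window of windowSize samples by factor. */
    static double binSpacingHz(double sampleRate, int windowSize, int factor) {
        return sampleRate / (windowSize * (double) factor);
    }
}
```

With 1024 samples at 44100 Hz, a factor of 8 gives bins about 5.4 Hz apart and a factor of 16 about 2.7 Hz apart.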
