Android 音频 FFT 使用 audiorecord 检索特定频率幅度

发布于 2024-11-03 07:22:18 字数 1146 浏览 1 评论 0原文

我目前正在尝试使用 Android 实现一些代码,以检测何时通过手机麦克风播放一些特定的音频范围。我已经使用 AudioRecord 类设置了该类:

int channel_config = AudioFormat.CHANNEL_CONFIGURATION_MONO;
int format = AudioFormat.ENCODING_PCM_16BIT;
int sampleSize = 8000;
int bufferSize = AudioRecord.getMinBufferSize(sampleSize, channel_config, format);
AudioRecord audioInput = new AudioRecord(AudioSource.MIC, sampleSize, channel_config, format, bufferSize);

然后读入音频:

short[] audioBuffer = new short[bufferSize];
audioInput.startRecording();
audioInput.read(audioBuffer, 0, bufferSize);

执行 FFT 是我遇到困难的地方,因为我在这方面的经验很少。我一直在尝试使用此类:

FFT in Java与之配套的复杂类

然后我发送以下值:

Complex[] fftTempArray = new Complex[bufferSize];
for (int i=0; i<bufferSize; i++)
{
    fftTempArray[i] = new Complex(audio[i], 0);
}
Complex[] fftArray = fft(fftTempArray);

这很容易让我误解这个类的工作原理,但是返回的值即使在安静的情况下,也不能代表一致的频率。有谁知道执行此任务的方法,或者我是否使问题过于复杂化,尝试仅获取少量频率范围而不是将其绘制为图形表示?

I am currently trying to implement some code using Android to detect when a number of specific audio frequency ranges are played through the phone's microphone. I have set up the class using the AudioRecord class:

int channel_config = AudioFormat.CHANNEL_CONFIGURATION_MONO;
int format = AudioFormat.ENCODING_PCM_16BIT;
int sampleSize = 8000;
int bufferSize = AudioRecord.getMinBufferSize(sampleSize, channel_config, format);
AudioRecord audioInput = new AudioRecord(AudioSource.MIC, sampleSize, channel_config, format, bufferSize);

The audio is then read in:

short[] audioBuffer = new short[bufferSize];
audioInput.startRecording();
audioInput.read(audioBuffer, 0, bufferSize);

Performing an FFT is where I become stuck, as I have very little experience in this area. I have been trying to use this class:

FFT in Java and Complex class to go with it

I am then sending the following values:

Complex[] fftTempArray = new Complex[bufferSize];
for (int i=0; i<bufferSize; i++)
{
    fftTempArray[i] = new Complex(audio[i], 0);
}
Complex[] fftArray = fft(fftTempArray);

This could easily be me misunderstanding how this class is meant to work, but the values returned jump all over the place and aren't representative of a consistent frequency even in silence. Is anyone aware of a way to perform this task, or am I overcomplicating matters to try and grab only a small number of frequency ranges rather than to draw it as a graphical representation?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

素手挽清风 2024-11-10 07:22:18

首先,您需要确保获得的结果正确转换为浮点/双精度。我不确定 Short[] 版本如何工作,但 byte[] 版本仅返回原始字节版本。然后需要将该字节数组正确转换为浮点数。转换代码应如下所示:

    double[] micBufferData = new double[<insert-proper-size>];
    final int bytesPerSample = 2; // As it is 16bit PCM
    final double amplification = 100.0; // choose a number as you like
    for (int index = 0, floatIndex = 0; index < bytesRecorded - bytesPerSample + 1; index += bytesPerSample, floatIndex++) {
        double sample = 0;
        for (int b = 0; b < bytesPerSample; b++) {
            int v = bufferData[index + b];
            if (b < bytesPerSample - 1 || bytesPerSample == 1) {
                v &= 0xFF;
            }
            sample += v << (b * 8);
        }
        double sample32 = amplification * (sample / 32768.0);
        micBufferData[floatIndex] = sample32;
    }

然后使用 micBufferData[] 创建输入复数数组。

获得结果后,使用结果中复数的大小。除了具有实际值的频率之外,大多数幅度应接近于零。

您需要采样频率将数组索引转换为这样的幅度到频率:

private double ComputeFrequency(int arrayIndex) {
    return ((1.0 * sampleRate) / (1.0 * fftOutWindowSize)) * arrayIndex;
}

First you need to ensure that the result you are getting is correctly converted to a float/double. I'm not sure how the short[] version works, but the byte[] version only returns the raw byte version. This byte array then needs to be properly converted to a floating point number. The code for the conversion should look something like this:

    double[] micBufferData = new double[<insert-proper-size>];
    final int bytesPerSample = 2; // As it is 16bit PCM
    final double amplification = 100.0; // choose a number as you like
    for (int index = 0, floatIndex = 0; index < bytesRecorded - bytesPerSample + 1; index += bytesPerSample, floatIndex++) {
        double sample = 0;
        for (int b = 0; b < bytesPerSample; b++) {
            int v = bufferData[index + b];
            if (b < bytesPerSample - 1 || bytesPerSample == 1) {
                v &= 0xFF;
            }
            sample += v << (b * 8);
        }
        double sample32 = amplification * (sample / 32768.0);
        micBufferData[floatIndex] = sample32;
    }

Then you use micBufferData[] to create your input complex array.

Once you get the results, use the magnitudes of the complex numbers in the results. Most of the magnitudes should be close to zero except the frequencies that have actual values.

You need the sampling frequency to convert the array indices to such magnitudes to frequencies:

private double ComputeFrequency(int arrayIndex) {
    return ((1.0 * sampleRate) / (1.0 * fftOutWindowSize)) * arrayIndex;
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文