实时进行 FFT

发布于 2024-11-19 12:28:05 字数 220 浏览 5 评论 0原文

我想实时对音频信号进行 FFT,即当人在麦克风中讲话时。我将获取数据(我使用 portaudio 来完成此操作,如果使用 wavein 会更容易,我会很乐意使用它 - 如果你能告诉我如何使用)。接下来我使用 FFTW 库 - 我知道如何执行 1D、2D(实数和复数)FFT,但我不太确定如何执行此操作,因为我必须执行 3D FFT 才能获取频率、幅度(这将确定颜色渐变)和时间。或者它只是一个 2D FFT,我得到幅度和频率?

I want to do the FFT of an audio signal in real time, meaning while the person is speaking in the microphone. I will fetch the data (I do this with portaudio, if it would be easier with wavein I would be happy to use that - if you can tell me how). Next I am using the FFTW library - I know how to perform 1D, 2D (real&complex) FFT, but I am not so sure how to do this, since I would have to do a 3D FFT to get frequency, amplitude (this would determine the color gradient) and time. Or is it just a 2D FFT, and I get amplitude and frequency?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

萌无敌 2024-11-26 12:28:05

我使用滑动 DFT,在每次样本到达输入缓冲区时都需要进行傅里叶变换的情况下,它比 FFT 快很多倍。

它基于这样一个事实:一旦您对最后 N 个样本执行了傅里叶变换,并且新样本到达,您就可以“撤消”最旧样本的效果,并应用最新样本的效果,以 >单次通过傅立叶数据!这意味着滑动 DFT 性能为 O(n),而 FFT 的性能为 O(Log2(n) 乘以 n)。此外,为了保持性能,缓冲区大小没有 2 的幂的限制。

下面的完整测试程序将滑动 DFT 与 fftw 进行了比较。在我的生产代码中,我已将以下代码优化为不可读,使其速度提高了三倍。

    #include <complex>
#include <iostream>
#include <time.h>
#include <math_defines.h>
#include <float.h>

#define DO_FFTW // libfftw
#define DO_SDFT

#if defined(DO_FFTW)
#pragma comment( lib, "d:\\projects\\common\\fftw\\libfftw3-3.lib" )

namespace fftw {
#include <fftw/fftw3.h>
}

fftw::fftw_plan plan_fwd;
fftw::fftw_plan plan_inv;

#endif

typedef std::complex<double> complex;

// Buffer size, make it a power of two if you want to improve fftw
const int N = 750;


// input signal
complex in[N];

// frequencies of input signal after ft
// Size increased by one because the optimized sdft code writes data to freqs[N]
complex freqs[N+1];

// output signal after inverse ft of freqs
complex out1[N];
complex out2[N];

// forward coeffs -2 PI e^iw -- normalized (divided by N)
complex coeffs[N];
// inverse coeffs 2 PI e^iw
complex icoeffs[N];

// global index for input and output signals
int idx;


// these are just there to optimize (get rid of index lookups in sdft)
complex oldest_data, newest_data;

//initilaize e-to-the-i-thetas for theta = 0..2PI in increments of 1/N
void init_coeffs()
{
    for (int i = 0; i < N; ++i) {
        double a = -2.0 * PI * i  / double(N);
        coeffs[i] = complex(cos(a)/* / N */, sin(a) /* / N */);
    }
    for (int i = 0; i < N; ++i) {
        double a = 2.0 * PI * i  / double(N);
        icoeffs[i] = complex(cos(a),sin(a));
    }
}


// initialize all data buffers
void init()
{
    // clear data
    for (int i = 0; i < N; ++i)
        in[i] = 0;
    // seed rand()
    srand(857);
    init_coeffs();
    oldest_data = newest_data = 0.0;
    idx = 0;
}

// simulating adding data to circular buffer
void add_data()
{
    oldest_data = in[idx];
    newest_data = in[idx] = complex(rand() / double(N));

}


// sliding dft
void sdft()
{
    complex delta = newest_data - oldest_data;
    int ci = 0;
    for (int i = 0; i < N; ++i) {
        freqs[i] += delta * coeffs[ci];
        if ((ci += idx) >= N)
            ci -= N;
    }
}

// sliding inverse dft
void isdft()
{
    complex delta = newest_data - oldest_data;
    int ci = 0;
    for (int i = 0; i < N; ++i) {
        freqs[i] += delta * icoeffs[ci];
        if ((ci += idx) >= N)
            ci -= N;
    }
}

// "textbook" slow dft, nested loops, O(N*N)
void ft()
{
    for (int i = 0; i < N; ++i) {
        freqs[i] = 0.0;
        for (int j = 0; j < N; ++j) {
            double a = -2.0 * PI * i * j / double(N);
            freqs[i] += in[j] * complex(cos(a),sin(a));
        }
    }
}

double mag(complex& c)
{
    return sqrt(c.real() * c.real() + c.imag() * c.imag());
}

void powr_spectrum(double *powr)
{
    for (int i = 0; i < N/2; ++i) {
        powr[i] = mag(freqs[i]);
    }

}

int main(int argc, char *argv[])
{
    const int NSAMPS = N*10;
    clock_t start, finish;

#if defined(DO_SDFT)
// ------------------------------ SDFT ---------------------------------------------
    init();

    start = clock();
    for (int i = 0; i < NSAMPS; ++i) {

        add_data();

        sdft();
        // Mess about with freqs[] here
        //isdft();

        if (++idx == N) idx = 0; // bump global index

        if ((i % 1000) == 0)
            std::cerr << i << " iters..." << '\r';
    }
    finish = clock();

    std::cout << "SDFT: " << NSAMPS / ((finish-start) / (double)CLOCKS_PER_SEC) << " fts per second." << std::endl;

    double powr1[N/2];
    powr_spectrum(powr1);
#endif

#if defined(DO_FFTW)

// ------------------------------ FFTW ---------------------------------------------
    plan_fwd = fftw::fftw_plan_dft_1d(N, (fftw::fftw_complex *)in, (fftw::fftw_complex *)freqs, FFTW_FORWARD, FFTW_MEASURE);
    plan_inv = fftw::fftw_plan_dft_1d(N, (fftw::fftw_complex *)freqs, (fftw::fftw_complex *)out2, FFTW_BACKWARD, FFTW_MEASURE);

    init();

    start = clock();
    for (int i = 0; i < NSAMPS; ++i) {

        add_data();

        fftw::fftw_execute(plan_fwd);
        // mess about with freqs here
        //fftw::fftw_execute(plan_inv);

        if (++idx == N) idx = 0; // bump global index

        if ((i % 1000) == 0)
            std::cerr << i << " iters..." << '\r';
    }
    // normalize fftw's output 
    for (int j = 0; j < N; ++j)
        out2[j] /= N;

    finish = clock();

    std::cout << "FFTW: " << NSAMPS / ((finish-start) / (double)CLOCKS_PER_SEC) << " fts per second." << std::endl;
    fftw::fftw_destroy_plan(plan_fwd);
    fftw::fftw_destroy_plan(plan_inv);

    double powr2[N/2];
    powr_spectrum(powr2);
#endif
#if defined(DO_SDFT) && defined(DO_FFTW)
// ------------------------------      ---------------------------------------------
    const double MAX_PERMISSIBLE_DIFF = 1e-11; // DBL_EPSILON;
    double diff;
    // check my ft gives same power spectrum as FFTW
    for (int i = 0; i < N/2; ++i)
        if ( (diff = abs(powr1[i] - powr2[i])) > MAX_PERMISSIBLE_DIFF)
            printf("Values differ by more than %g at index %d.  Diff = %g\n", MAX_PERMISSIBLE_DIFF, i, diff);

#endif
    return 0;
}

I use a Sliding DFT, which is many times faster than an FFT in the case where you need to do a fourier transform each time a sample arrives in the input buffer.

It's based on the fact that once you have performed a fourier transform for the last N samples, and a new sample arrives, you can "undo" the effect of the oldest sample, and apply the effect of the latest sample, in a single pass through the fourier data! This means that the sliding DFT performance is O(n) compared with O(Log2(n) times n) for the FFT. Also, there's no restriction to powers of two for the buffer size to maintain performance.

The complete test program below compares the sliding DFT with fftw. In my production code I've optimized the below code to unreadibility, to make it three times faster.

    #include <complex>
#include <iostream>
#include <time.h>
#include <math_defines.h>
#include <float.h>

#define DO_FFTW // libfftw
#define DO_SDFT

#if defined(DO_FFTW)
#pragma comment( lib, "d:\\projects\\common\\fftw\\libfftw3-3.lib" )

namespace fftw {
#include <fftw/fftw3.h>
}

fftw::fftw_plan plan_fwd;
fftw::fftw_plan plan_inv;

#endif

typedef std::complex<double> complex;

// Buffer size, make it a power of two if you want to improve fftw
const int N = 750;


// input signal
complex in[N];

// frequencies of input signal after ft
// Size increased by one because the optimized sdft code writes data to freqs[N]
complex freqs[N+1];

// output signal after inverse ft of freqs
complex out1[N];
complex out2[N];

// forward coeffs -2 PI e^iw -- normalized (divided by N)
complex coeffs[N];
// inverse coeffs 2 PI e^iw
complex icoeffs[N];

// global index for input and output signals
int idx;


// these are just there to optimize (get rid of index lookups in sdft)
complex oldest_data, newest_data;

//initilaize e-to-the-i-thetas for theta = 0..2PI in increments of 1/N
void init_coeffs()
{
    for (int i = 0; i < N; ++i) {
        double a = -2.0 * PI * i  / double(N);
        coeffs[i] = complex(cos(a)/* / N */, sin(a) /* / N */);
    }
    for (int i = 0; i < N; ++i) {
        double a = 2.0 * PI * i  / double(N);
        icoeffs[i] = complex(cos(a),sin(a));
    }
}


// initialize all data buffers
void init()
{
    // clear data
    for (int i = 0; i < N; ++i)
        in[i] = 0;
    // seed rand()
    srand(857);
    init_coeffs();
    oldest_data = newest_data = 0.0;
    idx = 0;
}

// simulating adding data to circular buffer
void add_data()
{
    oldest_data = in[idx];
    newest_data = in[idx] = complex(rand() / double(N));

}


// sliding dft
void sdft()
{
    complex delta = newest_data - oldest_data;
    int ci = 0;
    for (int i = 0; i < N; ++i) {
        freqs[i] += delta * coeffs[ci];
        if ((ci += idx) >= N)
            ci -= N;
    }
}

// sliding inverse dft
void isdft()
{
    complex delta = newest_data - oldest_data;
    int ci = 0;
    for (int i = 0; i < N; ++i) {
        freqs[i] += delta * icoeffs[ci];
        if ((ci += idx) >= N)
            ci -= N;
    }
}

// "textbook" slow dft, nested loops, O(N*N)
void ft()
{
    for (int i = 0; i < N; ++i) {
        freqs[i] = 0.0;
        for (int j = 0; j < N; ++j) {
            double a = -2.0 * PI * i * j / double(N);
            freqs[i] += in[j] * complex(cos(a),sin(a));
        }
    }
}

double mag(complex& c)
{
    return sqrt(c.real() * c.real() + c.imag() * c.imag());
}

void powr_spectrum(double *powr)
{
    for (int i = 0; i < N/2; ++i) {
        powr[i] = mag(freqs[i]);
    }

}

int main(int argc, char *argv[])
{
    const int NSAMPS = N*10;
    clock_t start, finish;

#if defined(DO_SDFT)
// ------------------------------ SDFT ---------------------------------------------
    init();

    start = clock();
    for (int i = 0; i < NSAMPS; ++i) {

        add_data();

        sdft();
        // Mess about with freqs[] here
        //isdft();

        if (++idx == N) idx = 0; // bump global index

        if ((i % 1000) == 0)
            std::cerr << i << " iters..." << '\r';
    }
    finish = clock();

    std::cout << "SDFT: " << NSAMPS / ((finish-start) / (double)CLOCKS_PER_SEC) << " fts per second." << std::endl;

    double powr1[N/2];
    powr_spectrum(powr1);
#endif

#if defined(DO_FFTW)

// ------------------------------ FFTW ---------------------------------------------
    plan_fwd = fftw::fftw_plan_dft_1d(N, (fftw::fftw_complex *)in, (fftw::fftw_complex *)freqs, FFTW_FORWARD, FFTW_MEASURE);
    plan_inv = fftw::fftw_plan_dft_1d(N, (fftw::fftw_complex *)freqs, (fftw::fftw_complex *)out2, FFTW_BACKWARD, FFTW_MEASURE);

    init();

    start = clock();
    for (int i = 0; i < NSAMPS; ++i) {

        add_data();

        fftw::fftw_execute(plan_fwd);
        // mess about with freqs here
        //fftw::fftw_execute(plan_inv);

        if (++idx == N) idx = 0; // bump global index

        if ((i % 1000) == 0)
            std::cerr << i << " iters..." << '\r';
    }
    // normalize fftw's output 
    for (int j = 0; j < N; ++j)
        out2[j] /= N;

    finish = clock();

    std::cout << "FFTW: " << NSAMPS / ((finish-start) / (double)CLOCKS_PER_SEC) << " fts per second." << std::endl;
    fftw::fftw_destroy_plan(plan_fwd);
    fftw::fftw_destroy_plan(plan_inv);

    double powr2[N/2];
    powr_spectrum(powr2);
#endif
#if defined(DO_SDFT) && defined(DO_FFTW)
// ------------------------------      ---------------------------------------------
    const double MAX_PERMISSIBLE_DIFF = 1e-11; // DBL_EPSILON;
    double diff;
    // check my ft gives same power spectrum as FFTW
    for (int i = 0; i < N/2; ++i)
        if ( (diff = abs(powr1[i] - powr2[i])) > MAX_PERMISSIBLE_DIFF)
            printf("Values differ by more than %g at index %d.  Diff = %g\n", MAX_PERMISSIBLE_DIFF, i, diff);

#endif
    return 0;
}
断桥再见 2024-11-26 12:28:05

如果您需要在一张图中包含幅度、频率和时间,则该变换称为时频分解。最流行的一种称为短时傅立叶变换。其工作原理如下:
1. 获取一小部分信号(例如 1 秒)
2. 用一个小窗口(例如 5 ms)对其进行窗口化
3. 计算加窗信号的一维傅立叶变换。
4. 稍微移动窗口(2.5 ms)
5. 重复上述步骤直至信号结束。
6. 所有这些数据都输入到一个矩阵中,然后用于创建信号的 3D 表示形式,显示信号沿频率、幅度和时间的分解。

窗口的长度将决定您在频域和时域中能够获得的分辨率。请在此处查看有关 STFT 的更多详细信息,并搜索“Robi Polikar”的小波教程以上变换为外行人的介绍。

编辑1:
你采用一个窗口函数(有无数的函数 - 这里有一个列表。大多数直观的是矩形窗口,但最常用的是汉明/汉宁窗函数,如果您手头有纸笔,则可以按照以下步骤进行绘制,

假设您获得的信号长度为 1 秒。窗口函数的长度为 5 毫秒,名称为 w[n]。窗口的 5ms 点与信号的 5ms 点重合)并将 x[n]w[n] 相乘,如下所示:
y[n] = x[n] * w[n] - 信号的逐点相乘。
y[n] 进行 FFT。

然后将窗口移动少量(例如 2.5 毫秒)。因此,现在信号 x[n] 的窗口从 2.5ms 延伸到 7.5ms。重复乘法和 FFT 生成步骤。换句话说,有 2.5 毫秒的重叠。您将看到,更改窗口的长度和重叠会在时间和频率轴上提供不同的分辨率。

完成此操作后,您需要将所有数据输入矩阵中,然后将其显示出来。重叠是为了最大限度地减少边界处可能出现的错误,并在如此短的时间范围内获得更一致的测量结果。

PS:如果您了解 STFT 和信号的其他时频分解,那么您应该对步骤 2 和 4 没有任何问题。您还没有理解上述步骤,这让我觉得您应该重新审视时频分解还。

If you need amplitude, frequency and time in one graph, then the transform is known as a Time-Frequency decomposition. The most popular one is called the Short Time Fourier Transform. It works as follows:
1. Take a small portion of the signal (say 1 second)
2. Window it with a small window (say 5 ms)
3. Compute the 1D fourier transform of the windowed signal.
4. Move the window by a small amount (2.5 ms)
5. Repeat above steps until end of signal.
6. All of this data is entered into a matrix that is then used to create the kind of 3D representation of the signal that shows its decomposition along frequency, amplitude and time.

The length of the window will decide the resolution you are able to obtain in frequency and time domains. Check here for more details on STFT and search for "Robi Polikar"'s tutorials on wavelet transforms for a layman's introduction to the above.

Edit 1:
You take a windowing function (there are innumerable functions out there - here is a list. Most intuitive is a rectangular window but the most commonly used are the Hamming/Hanning window functions. You can follow the steps below if you have a paper-pencil in hand and draw it along.

Assume that the signal that you have obtained is 1 sec long and is named x[n]. The windowing function is 5 msec long and is named w[n]. Place the window at the start of the signal (so the end of the window coincides with the 5ms point of the signal) and multiply the x[n] and w[n] like so:
y[n] = x[n] * w[n] - point by point multiplication of the signals.
Take an FFT of y[n].

Then you shift the window by a small amount (say 2.5 msec). So now the window stretches from 2.5ms to 7.5 ms of the signal x[n]. Repeat the multiplication and FFT generation steps. In other words, you have an overlap of 2.5 msec. You will see that changing the length of the window and the overlap gives you different resolutions on the time and Frequency axis.

Once you do this, you need to feed all the data into a matrix and then have it displayed. The overlap is for minimising the errors that might arise at boundaries and also to get more consistent measurements over such short time frames.

P.S: If you had understood STFT and other time-frequency decompositions of a signal, then you should have had no problems with steps 2 and 4. That you have not understood the above mentioned steps makes me feel like you should revisit time-frequency decompositions also.

烟─花易冷 2024-11-26 12:28:05

您可以通过选择较短的时间跨度并仅分析(FFT)该时间跨度来创建实时 FFT。您可能只需选择 100-500 毫秒的非重叠时间跨度即可逃脱惩罚;更纯粹的分析方法是使用滑动窗口(同样是 100-500 毫秒),但这通常是不必要的,您可以在不重叠的时间跨度上显示漂亮的图形,而无需太多的处理能力。

You can create a realtime FFT by choosing a short time-span and analysing (FFT'ing) just that time-span. You can probably get away with just selecting non-overlapping timespans of say 100-500 milliseconds; the analytically purer way to do this would be using a sliding-window (again of e.g. 100-500 ms), but that is often unnecessary and you can show nice graphics with the non-overlapping timespans without much processing power.

烟花易冷人易散 2024-11-26 12:28:05

实时FFT的含义与您刚才描述的完全不同。这意味着对于给定的 N 和 X[N],您的算法会给出 Fx[i],同时递增值 i。意思是,在当前值计算完成之前,不会计算当前值。这和你刚才描述的完全不同。

硬件通常使用大约 1k-16k 点的 FFT。固定N,不是实时计算。如先前答案所述移动窗口 FFT。

Real-time FFT means completely different from what you just described. It means that for given N and X[N] your algorithm gives Fx[i] while incrementing value i. Meaning, proceeding value does not compute until current value computation completed. This is completely different from what you just described.

Hardware usually uses FFT with around 1k-16k points. Fixed N, not real-time computation. Moving window FFT as described with previous answers.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文