Speeding up a sparse FFT computation

Posted on 2024-10-14 10:01:23

I'm hoping someone can review my code below and offer hints on how to speed up the section between tic and toc. The function below attempts to perform an IFFT faster than Matlab's built-in function, since (1) almost all of the FFT-coefficient bins are zero (i.e. only 10 to 1000 bins out of 10M to 300M are non-zero), and (2) only the central third of the IFFT result is retained (the first and last thirds are discarded, so there is no need to compute them in the first place).

The input variables are:

fftcoef = complex fft-coef 1D array (10 to 1000 pts long)
bins = index of fft coefficients corresponding to fftcoef (10 to 1000 pts long)
DATAn = # of pts in data before zero padding and fft (in range of 10M to 260M)
FFTn = DATAn + # of pts used to zero pad before taking fft (in range of 16M to 268M) (e.g. FFTn = 2^nextpow2(DATAn))

Currently, this code takes a few orders of magnitude longer than the approach based on Matlab's ifft function, which computes the entire time-domain output and then discards two-thirds of it. For example, if the input data for fftcoef and bins are 9x1 arrays (i.e. only 9 complex fft coefficients per sideband; 18 pts when considering both sidebands), and DATAn=32781534, FFTn=33554432 (i.e. 2^25), then the ifft approach takes 1.6 seconds whereas the loop below takes over 700 seconds.

I've avoided using a matrix to vectorize the nn loop, since the arrays fftcoef and bins can sometimes be up to 1000 pts long, and a 260Mx1K matrix would be too large for memory unless it could be broken up somehow.

Any advice is much appreciated! Thanks in advance.

function ttrend = fn_fft_v1p0(fftcoef, bins, DATAn, FFTn)

fftcoef = [fftcoef; (conj(flipud(fftcoef)))];     % fft coefficients
bins = [bins; (FFTn - flipud(bins) +2)];          % corresponding fft indices for fftcoef array

ttrend = zeros( (round(2*DATAn/3) - round(DATAn/3) + 1), 1); % preallocate

start = round(DATAn/3)-1;

tic;
for nn = start+1 : round(2*DATAn/3)  % loop over desired time indices
  % sum over all fft indices having non-zero coefficients
  arg = 2*pi*(bins-1)*(nn-1)/FFTn;
  ttrend(nn-start) = sum( fftcoef.*( cos(arg) + 1j*sin(arg) ) );
end
toc;

end
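
The memory concern about a 260Mx1K matrix can be addressed by breaking the time indices into blocks, so that only a small slice of that matrix exists at any moment. The following is a minimal sketch, not a tuned implementation: the function name `sparse_ifft_blocked` and the block-size argument `blk` are hypothetical, the inputs are the unexpanded single-sideband `fftcoef` and `bins` (neither DC nor Nyquist assumed present), and, like the loop above, it omits the 1/FFTn scaling that `ifft` applies.

```matlab
function ttrend = sparse_ifft_blocked(fftcoef, bins, DATAn, FFTn, blk)
% Hypothetical helper: evaluates the middle third of a real-valued inverse
% DFT from a few non-zero bins, blk time samples at a time.
% fftcoef, bins : coefficients and 1-based bin indices of ONE sideband
% blk           : number of time samples processed per block (assumed tuning knob)

first = round(DATAn/3);               % first retained time index (1-based)
last  = round(2*DATAn/3);             % last retained time index (1-based)
ttrend = zeros(last - first + 1, 1);  % preallocate the output

w = 2*pi*(bins(:).' - 1)/FFTn;        % 1 x K row of phase steps per sample
c = fftcoef(:);                       % K x 1 column of coefficients

for lo = first:blk:last
    hi = min(lo + blk - 1, last);     % last index of this block
    n  = (lo:hi).' - 1;               % zero-based time indices, B x 1
    % n*w is a B x K phase matrix; multiplying by c sums over the K bins.
    % The factor 2*real(...) accounts for the conjugate-symmetric sideband.
    ttrend(lo-first+1 : hi-first+1) = 2*real(exp(1i*(n*w)) * c);
end
end
```

For example, `ttrend = sparse_ifft_blocked(fftcoef, bins, DATAn, FFTn, 2^14);` keeps the temporary complex matrix at 2^14 x K values, which is about 260 MB for K = 1000. The best block size depends on the machine and would need to be measured.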


Comments (2)

韵柒 2024-10-21 10:01:23

You have to keep in mind that Matlab uses a compiled FFT library (http://www.fftw.org/) for its fft functions, which, besides running much faster than a Matlab script, is well optimized for many use cases. So a first step might be to write your code in C/C++ and compile it as a mex file that you can call from Matlab. That will surely speed up your code by at least an order of magnitude (probably more).

Besides that, one simple optimization follows from two observations:

  1. You assume your time series is real valued, so you can use the symmetry of the fft coeffs.
  2. Your time series is typically much longer than your fft coeffs vector, so it is better to iterate over bins instead of time points (thus vectorizing over the longer vector).
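
The first point can be written out explicitly. With zero-based bin index $k$ (that is, `bins-1`), zero-based time index $n$ (that is, `nn-1`), $N$ = `FFTn`, and $K$ the set of non-zero bins in one sideband (assuming it contains neither $k = 0$ nor $k = N/2$), a real-valued series satisfies $X[N-k] = \overline{X[k]}$, so each bin pairs with its mirror image:

$$
x[n] \;=\; \frac{1}{N}\sum_{k \in K}\left( X[k]\,e^{\,2\pi i k n/N} + \overline{X[k]}\,e^{-2\pi i k n/N} \right)
\;=\; \frac{2}{N}\sum_{k \in K}\operatorname{Re}\!\left( X[k]\,e^{\,2\pi i k n/N} \right).
$$

The code in this thread omits the $1/N$ factor, so its result is $N$ times what `ifft` would return.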

These two points are translated to the following loop:

nn=(start+1 : round(2*DATAn/3))';
ttrend2 = zeros( (round(2*DATAn/3) - round(DATAn/3) + 1), 1);
tic;
for bn = 1:length(bins)
     arg = 2*pi*(bins(bn)-1)*(nn-1)/FFTn; 
     ttrend2 = ttrend2 +  2*real(fftcoef(bn) * exp(1i*arg)); 
end
toc;

Note that you have to run this loop before you expand bins and fftcoef, since the symmetry is already taken into account. With the parameters from your question, this loop takes 8.3 seconds on my PC, while your code takes 141.3 seconds.
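
A quick way to gain confidence in the bin loop is to compare it against `ifft` on a problem small enough to hold the full spectrum. The sketch below uses made-up sizes and coefficients (every numeric value is an assumption chosen for illustration), assumes the bins contain neither DC nor Nyquist, and adds the 1/FFTn scaling so that the two results are directly comparable.

```matlab
% Small-scale check of the bin loop against ifft (all values are illustrative).
FFTn    = 4096;                       % padded FFT length
DATAn   = 3000;                       % data length before zero padding
bins    = [5; 17; 120];               % 1-based non-zero bins, one sideband
fftcoef = [1+2i; -0.5+0.25i; 3-1i];   % coefficients for those bins

% Reference: build the full conjugate-symmetric spectrum and call ifft.
X = zeros(FFTn, 1);
X(bins) = fftcoef;
X(FFTn - bins + 2) = conj(fftcoef);   % mirrored sideband
xfull = ifft(X);

start = round(DATAn/3) - 1;
nn    = (start+1 : round(2*DATAn/3))';
ref   = real(xfull(nn));              % keep only the middle third

% Bin loop with the symmetry folded in, then scaled like ifft.
ttrend2 = zeros(numel(nn), 1);
for bn = 1:numel(bins)
    arg = 2*pi*(bins(bn)-1)*(nn-1)/FFTn;
    ttrend2 = ttrend2 + 2*real(fftcoef(bn) * exp(1i*arg));
end
ttrend2 = ttrend2 / FFTn;

err = max(abs(ttrend2 - ref));
fprintf('max abs difference: %g\n', err);
```

The printed difference should be at the level of floating-point rounding; anything larger would point to an indexing or symmetry mistake.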

病毒体 2024-10-21 10:01:23

I have posted a question/answer, "Accelerating FFTW pruning to avoid massive zero padding", which solves the problem for the C++ case using FFTW. You can use that solution from Matlab through mex files.
