Improving FFT performance in Python
What is the fastest FFT implementation in Python?
It seems numpy.fft and scipy.fftpack both are based on fftpack, and not FFTW. Is fftpack as fast as FFTW? What about using multithreaded FFT, or using distributed (MPI) FFT?
Comments (6)
You could certainly wrap whichever FFT implementation you want to test using Cython or a similar tool that gives you access to external libraries.
GPU-based
If you're going to test FFT implementations, you might also take a look at GPU-based codes (if you have access to the proper hardware). There are several: reikna.fft, scikits.cuda.
CPU-based
There's also a CPU-based Python wrapper for FFTW, pyFFTW.
(There is pyFFTW3 as well, but it is less actively maintained than pyFFTW and does not work with Python 3 (source).)
I don't have experience with any of these. It's probably going to fall to you to do some digging around and benchmark different codes for your particular application if speed is important to you.
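A minimal sketch of such a benchmark, assuming only that NumPy is installed. scipy.fftpack and pyFFTW (through its pyfftw.interfaces.numpy_fft module) are timed only if they can be imported. The helper names candidate_ffts and benchmark are made up for this illustration, and the sizes are arbitrary, so treat the numbers as a starting point rather than a verdict.

```python
import timeit

import numpy as np


def candidate_ffts():
    """Collect the FFT callables that can be imported on this machine."""
    found = {"numpy.fft": np.fft.fft}
    try:
        from scipy import fftpack
        found["scipy.fftpack"] = fftpack.fft
    except ImportError:
        pass
    try:
        import pyfftw.interfaces.numpy_fft as fftw_numpy
        found["pyfftw (numpy interface)"] = fftw_numpy.fft
    except ImportError:
        pass
    return found


def benchmark(n, repeat=5, number=10):
    """Return the best observed seconds per call for each available FFT."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    results = {}
    for name, fft in candidate_ffts().items():
        fft(x)  # warm-up call, so one-time setup cost is not measured
        timings = timeit.repeat(lambda: fft(x), repeat=repeat, number=number)
        results[name] = min(timings) / number
    return results


if __name__ == "__main__":
    # 4096 is a power of two; 4099 is prime, which is a harder case
    for n in (4096, 4099):
        for name, seconds in benchmark(n).items():
            print(f"n={n:5d}  {name:28s} {seconds * 1e6:10.1f} us per call")
```

Running it on your own array sizes and dtypes matters more than any published figure, since the ranking can change with transform length.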
For a test detailed at https://gist.github.com/fnielsen/99b981b9da34ae3d5035 I find that scipy.fftpack performs fine compared to my simple application of pyfftw via pyfftw.interfaces.scipy_fftpack, except for data with a length corresponding to a prime number. There seems to be some setup cost associated with invoking pyfftw.interfaces.scipy_fftpack.fft the first time; the second call is faster. Numpy's and scipy's fftpack perform terribly on prime-length data for the sizes I tried. A CZT (chirp z-transform) is faster in that case. Some months ago an issue about the problem was opened on Scipy's Github, see https://github.com/scipy/scipy/issues/4288
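To make the CZT remark concrete, here is a sketch of Bluestein's algorithm, the chirp-z trick that rewrites a DFT of any length as a convolution computed with power-of-two FFTs. It is not the code from the gist; bluestein_fft is a name chosen for this illustration, and it assumes only NumPy. Newer NumPy and SciPy releases may already treat prime lengths differently from the fftpack-based versions discussed here, so time it before relying on it.

```python
import numpy as np


def bluestein_fft(x):
    """DFT of arbitrary length n using only power-of-two FFTs (Bluestein)."""
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    # Power-of-two length that can hold the linear convolution (>= 2n - 1)
    m = 1 << (2 * n - 1).bit_length()
    k = np.arange(n)
    # Chirp w[k] = exp(-i*pi*k^2/n); k^2 is reduced mod 2n to keep the phase accurate
    w = np.exp(-1j * np.pi * ((k * k) % (2 * n)) / n)
    a = np.zeros(m, dtype=complex)
    a[:n] = x * w
    # Convolution kernel conj(w[|d|]) for lags d = -(n-1)..(n-1), wrapped circularly
    b = np.zeros(m, dtype=complex)
    b[:n] = np.conj(w)
    b[m - n + 1:] = np.conj(w[1:])[::-1]
    conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))
    return w * conv[:n]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.standard_normal(1009)  # 1009 is prime
    print(np.allclose(bluestein_fft(signal), np.fft.fft(signal)))
```

The identity behind it is jk = (j^2 + k^2 - (j - k)^2) / 2, which turns the DFT sum into a chirp multiplication, a convolution, and another chirp multiplication.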
The pyFFTW3 package is inferior to the pyFFTW library, at least implementation-wise. Since they both wrap the FFTW3 library, I guess the speed should be the same.
https://pypi.python.org/pypi/pyFFTW
Where I work, some researchers have compiled a Fortran library that sets up and calls FFTW for a particular problem. This Fortran library (a module with some subroutines) expects some input data (2D lists) from my Python program.
What I did was create a little C extension for Python that wraps the Fortran library. It basically exposes an "init" function to set up an FFTW planner, another function to feed in my 2D lists (arrays), and a "compute" function.
Creating a C extension is a small task, and there are a lot of good tutorials out there for it.
The good thing about this approach is that we get speed, a lot of speed. The only drawback is in the C extension, where we must iterate over the Python list and extract all the Python data into a memory buffer.
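One possible way to soften that drawback, assuming NumPy is acceptable on the Python side (the answer does not say it was used): convert the 2D list once into a column-major float64 array and hand the extension a single contiguous buffer. The function name to_fortran_buffer and the sample data are hypothetical. The list is still traversed once, but inside NumPy rather than in hand-written extension code.

```python
import numpy as np


def to_fortran_buffer(rows):
    """Copy a 2D list into one contiguous, column-major float64 array."""
    # Fortran stores arrays column by column, hence asfortranarray
    return np.asfortranarray(rows, dtype=np.float64)


if __name__ == "__main__":
    rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # hypothetical input data
    buf = to_fortran_buffer(rows)
    print(buf.shape, buf.flags["F_CONTIGUOUS"])
    # buf.ctypes.data is the address of the first element; an extension can
    # also obtain the same memory through the buffer protocol
    print(hex(buf.ctypes.data))
```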
The FFTW site shows fftpack running about 1/3 as fast as FFTW, but that's with a mechanically translated Fortran-to-C step followed by C compilation, and I don't know if numpy/scipy uses a more direct Fortran compilation. If performance is critical to you, you might consider compiling FFTW into a DLL/shared library and using ctypes to access it, or building a custom C extension.
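A rough sketch of the ctypes route, assuming a double-precision FFTW 3 shared library is installed where ctypes.util.find_library("fftw3") can find it (the library name varies by platform, so that lookup is an assumption). The C functions fftw_plan_dft_1d, fftw_execute and fftw_destroy_plan and the two constants come from fftw3.h; load_fftw and fftw_fft are names invented here. A real wrapper would create the plan once and reuse it rather than planning on every call.

```python
import ctypes
import ctypes.util

import numpy as np

FFTW_FORWARD = -1        # transform direction, as defined in fftw3.h
FFTW_ESTIMATE = 1 << 6   # planner flag: choose a plan without measuring


def load_fftw():
    """Return a ctypes handle to FFTW 3, or None if it cannot be loaded."""
    path = ctypes.util.find_library("fftw3")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    lib.fftw_plan_dft_1d.restype = ctypes.c_void_p
    lib.fftw_plan_dft_1d.argtypes = [
        ctypes.c_int, ctypes.c_void_p, ctypes.c_void_p,
        ctypes.c_int, ctypes.c_uint,
    ]
    lib.fftw_execute.restype = None
    lib.fftw_execute.argtypes = [ctypes.c_void_p]
    lib.fftw_destroy_plan.restype = None
    lib.fftw_destroy_plan.argtypes = [ctypes.c_void_p]
    return lib


def fftw_fft(lib, x):
    """Forward 1D complex FFT of x computed by FFTW through ctypes."""
    src = np.ascontiguousarray(x, dtype=np.complex128)
    dst = np.empty_like(src)
    plan = lib.fftw_plan_dft_1d(src.size, src.ctypes.data, dst.ctypes.data,
                                FFTW_FORWARD, FFTW_ESTIMATE)
    if not plan:
        raise RuntimeError("FFTW could not create a plan")
    try:
        lib.fftw_execute(plan)
    finally:
        lib.fftw_destroy_plan(plan)
    return dst


if __name__ == "__main__":
    lib = load_fftw()
    if lib is None:
        print("FFTW shared library not found")
    else:
        x = np.arange(8, dtype=np.complex128)
        print(np.allclose(fftw_fft(lib, x), np.fft.fft(x)))
```

FFTW_ESTIMATE keeps the example simple and leaves the input array untouched; the slower planner modes are where much of FFTW's speed comes from, at the price of a setup cost on the first call.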
FFTW3 seems to be the fastest implementation available that's nicely wrapped. The PyFFTW bindings in the first answer work. Here's some code that compares execution times: test_ffts.py