fft 算法的基准测试方法
我目前正在开发一个具有自己的内部 fft(快速傅里叶变换)库的库,我想将其替换为 FFTW 。现在,其他开发人员有点担心它可能导致的性能问题。另外,速度方面最关键的部分是处理半复实数的一维卷积算法。 (我正在使用 fftw 的 fftw_plan_r2r_1d)。
此外,事情有点复杂,因为 fftw 内部根据变换的大小使用不同的算法。
我当前的想法是生成一堆不同长度的数据集。然后读入它们并在进行转换之前以预定方式修改每次迭代的数据集数组。
或者还有什么我应该知道的吗?
I'm currently working on a library that has its own internal fft (Fast Fourier Transform) library which I would like to replace with FFTW. Now, other developers are a bit concerned about the performance issues it might cause. Also the most critical part speed-wise is the 1D convolution algorithm which deals with half-complex reals. (I'm using fftw's fftw_plan_r2r_1d).
Also, things are a bit more complicated because internally fftw uses different algorithms depending on the size of the transform.
My current idea is to generate a bunch of different length datasets. Then read them in and modify the dataset array for each iteration in a predetermined way before doing the transformation.
Or is there anything else I should know?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
确保为每个测试用例生成最佳的 FFTW 计划。 PATIENT 和 EXHAUSTIVE 标志可以导致更快的计划,但它们可能需要大量时间才能实现。 (显然,您不应该在基准计时中包含这一时间,因为它是一次性的且可缓存。)
如果您只需要单精度输入/输出数据,则构建 FFTW 库的单精度版本 - 它们可能会快得多默认的双精度版本,对于信号处理和图像处理等大多数应用来说足够准确。
另外,在构建 FFTW 库时,请确保启用 SIMD(如果适合您的架构),例如 x86 上的 SSE 或 PowerPC 上的 AltiVec。
Make sure you generate an optimal plan for FFTW for each test case. The PATIENT and EXHAUSTIVE flags can result in faster plans, but they can take a significant amount of time to get there. (Obviously you shouldn't include this time in your benchmark timing as it's one off and cacheable.)
If you only need single precision input/output data then build the single precision version of the FFTW libraries - they can be quite a bit faster then the default double precision version and are plenty accurate enough for most appilcations in e.g. signal processing and image processing.
Also when building the FFTW libraries make sure you enable SIMD if appropriate for your architecture, e.g. SSE on x86 or AltiVec on PowerPC.
您还可以尝试我的混合基数 fft 例程,可以从我的主页。它包括一个速度和准确性的基准测试程序,您可以将其用于评估。
You can also try out my mixed radix fft routine, which is available from my homepage. It includes a benchmark test program for both speed and accuracy that you may use for your evaluation.