How to get all the data from a 2D real-to-complex FFT in CUDA
I am trying to do a 2D real-to-complex (R2C) FFT using cuFFT.
I realize that the transform gives me back only W/2+1 complex values per row (W being the "width" of my H*W matrix).
The question is: what if I want to build out the full H*W version of this matrix after the transform? How do I copy values from the H*(W/2+1) result matrix into a full-size matrix so that both halves and the DC value end up in the right place?
Thanks
Comments (3)
I'm not familiar with CUDA, so take that into consideration when reading my response. I am familiar with FFTs and signal processing in general, though.
It sounds like you start out with an H (rows) x W (cols) matrix and run a 2D FFT on it (an FFT along each row, followed by one along each column), ending up with an H x (W/2+1) matrix. A W-wide FFT normally returns W values per row, but the CUDA function only returns W/2+1 because the spectrum of real data is conjugate-symmetric (Hermitian), so the negative-frequency data is redundant.
So, if you want to reproduce the missing W/2-1 points (for even W), mirror the positive frequencies and take the complex conjugate. For instance, suppose one of the rows is as follows:
Index Data
0 12 + i
1 5 + 2i
2 6
3 2 - 3i
...
The 0 index is your DC term, the 1 index is the lowest positive frequency bin, and so forth. For a single 1D row, you would thus make your closest-to-DC negative frequency bin 5 - 2i (the conjugate of index 1), the next closest 6, and so on. In a 2D transform the mirrored value also comes from the mirrored row: full[h][W-w] = conj(half[(H-h) mod H][w]). Where you put those values in the array is up to you. I would do it the way Matlab does it, with the negative frequency data after the positive frequency data.
I hope that makes sense.
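The mirroring described above can be sketched on the CPU as follows. This is only an illustrative reference, not the cuFFT API: the function name `expand_half_spectrum` and the `cfloat` alias are made up for this example, and it assumes the half spectrum is stored row-major as H rows of W/2+1 values, with negative frequencies placed after the positive ones in the output. The loop body computes one output element independently of the others, so the same indexing could be moved into a CUDA kernel with one thread per element.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;  // hypothetical alias for this sketch

// Expand an H x (W/2+1) half spectrum (row-major) into the full H x W
// spectrum, using the Hermitian symmetry of a real-input 2D FFT:
//   full[r][c] = conj(half[(H - r) mod H][W - c])   for c > W/2
std::vector<cfloat> expand_half_spectrum(const std::vector<cfloat>& half,
                                         std::size_t H, std::size_t W) {
    const std::size_t Wh = W / 2 + 1;  // stored columns per row
    std::vector<cfloat> full(H * W);
    for (std::size_t r = 0; r < H; ++r) {
        for (std::size_t c = 0; c < W; ++c) {
            if (c < Wh) {
                // Non-negative frequencies are copied straight across.
                full[r * W + c] = half[r * Wh + c];
            } else {
                // Negative frequencies: conjugate of the mirrored bin,
                // taken from the mirrored row.
                const std::size_t mr = (H - r) % H;
                const std::size_t mc = W - c;
                full[r * W + c] = std::conj(half[mr * Wh + mc]);
            }
        }
    }
    return full;
}
```

For a single row with W = 4 and half-spectrum values 10, 5+2i, 6, this yields 10, 5+2i, 6, 5-2i.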
There are two ways this can be achieved. You will have to write your own kernel for either one.
1) Take the complex conjugate of the (half) data you get back to reconstruct the other half.
2) Since you want the full result anyway, convert the input data from real to complex (by padding with a zero imaginary part) and perform a complex-to-complex transform instead.
In practice I have found that there is not much of a difference in speed either way.
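As a rough illustration of option 2, here is a small CPU-side sketch. The names `pad_real_to_complex` and `naive_dft2d` are hypothetical helpers invented for this example; the naive O(N^2) DFT is only a stand-in so the sketch is self-contained, and on the GPU the transform step would be a cuFFT complex-to-complex transform instead.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;  // hypothetical alias for this sketch

// Option 2, step 1: promote real samples to complex with a zero imaginary part.
std::vector<cfloat> pad_real_to_complex(const std::vector<float>& real) {
    std::vector<cfloat> out;
    out.reserve(real.size());
    for (float x : real) {
        out.emplace_back(x, 0.0f);
    }
    return out;
}

// Stand-in for the complex-to-complex transform: a naive forward 2D DFT
// over an H x W row-major matrix. Far too slow for real use; it is here
// only to show that the full H x W spectrum comes out directly.
std::vector<cfloat> naive_dft2d(const std::vector<cfloat>& in,
                                std::size_t H, std::size_t W) {
    const double two_pi = 6.283185307179586;
    std::vector<cfloat> out(H * W);
    for (std::size_t u = 0; u < H; ++u) {
        for (std::size_t v = 0; v < W; ++v) {
            std::complex<double> acc(0.0, 0.0);
            for (std::size_t r = 0; r < H; ++r) {
                for (std::size_t c = 0; c < W; ++c) {
                    const double angle = -two_pi *
                        (static_cast<double>(u * r) / static_cast<double>(H) +
                         static_cast<double>(v * c) / static_cast<double>(W));
                    acc += std::complex<double>(in[r * W + c]) *
                           std::polar(1.0, angle);
                }
            }
            out[u * W + v] = cfloat(acc);
        }
    }
    return out;
}
```

The trade-off is that the complex-to-complex route needs twice the input memory and computes the redundant half explicitly, while option 1 computes only the half spectrum and needs an extra copy step to fill in the rest.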
I actually searched the NVIDIA forums and found a kernel that someone had written that did just what I was asking. That is what I used. If you search the CUDA forum for "redundant results fft" or similar, you will find it.