CUDA: replacing a double for loop with a 2D thread block

Posted on 2024-11-08 13:45:24


I'm really new to CUDA and have been trying to traverse a 2D array. I have the following code which works as expected on plain C:

for (ty = 0; ty < s; ty++) {
    if (ty + pixY < s && ty + pixY >= 0) {
        for (tx = 0; tx < r; tx++) {
            T[ty/3][tx/3] += (tx + pixX < s && tx + pixX >= 0) ?
                *(image + M*(ty + pixY) + tx + pixX) * *(filter + fw*(ty % 3) + tx % 3) : 0;
        }
    }
}

Maybe I'm getting something wrong, but wouldn't this code translate to CUDA as follows?

tx = threadIdx.x;
ty = threadIdx.y;

T[ty/3][tx/3] += (tx + pixX < s && tx + pixX >= 0) ?
    *(image + M*(ty + pixY) + tx + pixX) * *(filter + fw*(ty % 3) + tx % 3) : 0;

provided I have defined my kernel launch parameters as dimGrid(1,1,1) and blockDim(r,s,1).

I ask because I'm getting unexpected results. Also, if I properly declare and allocate my arrays as 2D CUDA arrays instead of just one big 1D array, will this help?

Thanks for your help.


Comments (1)

夏日浅笑〃 2024-11-15 13:45:25


Leaving aside whether the array allocation and indexing schemes are correct (I am not sure there is enough information in the post to confirm that), and the fact that integer division and modulo are slow and should be avoided, you have a much more fundamental problem - a memory race.

Multiple threads within the single block you are using will be attempting to read and write to the same entry of T at the same time. CUDA makes no guarantees about the correctness of this sort of operation and it is almost certainly not going to work. The simplest alternative is to only use a single thread to compute each T[][] entry, rather than three threads. This eliminates the memory race.
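As a minimal sketch of that suggestion, the kernel below assigns exactly one thread to each T[][] entry, and each thread privately accumulates its own 3x3 window before writing. The kernel name, the parameter list, and the assumption that T is stored as a flat row-major array of width r/3 are mine for illustration, not from the original post:

```cuda
// Sketch only: one thread computes one T entry, so no two threads ever
// write the same output element and the memory race disappears.
// Assumes image, filter, and T are flat row-major device arrays.
__global__ void convolve3x3(float *T, const float *image, const float *filter,
                            int r, int s, int M, int fw, int pixX, int pixY)
{
    int ox = blockIdx.x * blockDim.x + threadIdx.x;  // column index into T
    int oy = blockIdx.y * blockDim.y + threadIdx.y;  // row index into T
    if (ox >= r / 3 || oy >= s / 3) return;

    float acc = 0.0f;
    // Privately accumulate the 3x3 window that maps to T[oy][ox];
    // ty % 3 == dy and tx % 3 == dx, matching the original indexing.
    for (int dy = 0; dy < 3; dy++) {
        int ty = oy * 3 + dy;
        if (ty >= s || ty + pixY < 0 || ty + pixY >= s) continue;
        for (int dx = 0; dx < 3; dx++) {
            int tx = ox * 3 + dx;
            if (tx >= r || tx + pixX < 0 || tx + pixX >= s) continue;
            acc += image[M * (ty + pixY) + tx + pixX] * filter[fw * dy + dx];
        }
    }
    T[oy * (r / 3) + ox] += acc;  // single writer per entry: no race
}
```

It would be launched with a 2D grid sized to the output rather than the input, e.g. `dim3 block(16, 16); dim3 grid((r/3 + 15) / 16, (s/3 + 15) / 16);`, which also removes the single-block limit of `dimGrid(1,1,1)`.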
