CUDA thread allocation
I have gone through the CUDA programming guide and I cannot understand the thread allocation method shown below:
dim3 dimGrid( 2, 2, 1 );
dim3 dimBlock( 4, 2, 2 );
KernelFunction<<< dimGrid, dimBlock >>>(. . .);
Can someone explain how threads are allocated for the above configuration?
Comments (1)
An intuitive way to think about the grid and its blocks is to visualize them:
Your dimBlock( 4, 2, 2 ) means that each block has 4 x 2 x 2 = 16 threads.
Your dimGrid( 2, 2, 1 ) means that the grid has 2 x 2 x 1 = 4 blocks.
Thus, your kernel is launched on a grid of 4 blocks, where each block has 16 threads. To conclude, your kernel will be launched with 16 x 4 = 64 threads.
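To make the arithmetic concrete, here is a minimal host-side sketch in plain C++ that mimics how the indices work out for this launch. It is not CUDA code: Dim3 is a hypothetical stand-in for dim3, and all function names are made up for illustration. The in-block formula (x + y*Dx + z*Dx*Dy) is the one the CUDA programming guide gives for a thread's ID within its block; flattening blocks the same way to get a global ID is a common convention, not something the hardware imposes.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3 so the sketch compiles without the CUDA toolkit.
struct Dim3 {
    unsigned x, y, z;
};

// Number of threads in one block: product of the block dimensions.
constexpr unsigned threadsPerBlock(Dim3 blockDim) {
    return blockDim.x * blockDim.y * blockDim.z;
}

// Number of blocks in the grid: product of the grid dimensions.
constexpr unsigned blocksInGrid(Dim3 gridDim) {
    return gridDim.x * gridDim.y * gridDim.z;
}

// Total threads launched: blocks in the grid times threads per block.
constexpr unsigned totalThreads(Dim3 gridDim, Dim3 blockDim) {
    return blocksInGrid(gridDim) * threadsPerBlock(blockDim);
}

// Linear ID of a thread inside its block, with x varying fastest.
constexpr unsigned threadIdInBlock(Dim3 threadIdx, Dim3 blockDim) {
    return threadIdx.x + threadIdx.y * blockDim.x
         + threadIdx.z * blockDim.x * blockDim.y;
}

// Linear ID of a block inside the grid, flattened the same way (assumed convention).
constexpr unsigned blockIdInGrid(Dim3 blockIdx, Dim3 gridDim) {
    return blockIdx.x + blockIdx.y * gridDim.x
         + blockIdx.z * gridDim.x * gridDim.y;
}

// Unique ID across the whole launch: skip the threads of all earlier blocks,
// then add the thread's position inside its own block.
constexpr unsigned globalThreadId(Dim3 blockIdx, Dim3 threadIdx,
                                  Dim3 gridDim, Dim3 blockDim) {
    return blockIdInGrid(blockIdx, gridDim) * threadsPerBlock(blockDim)
         + threadIdInBlock(threadIdx, blockDim);
}

// Checks the numbers for the launch in the question.
inline void checkQuestionLaunch() {
    const Dim3 grid{2, 2, 1};   // dimGrid( 2, 2, 1 )
    const Dim3 block{4, 2, 2};  // dimBlock( 4, 2, 2 )
    assert(threadsPerBlock(block) == 16);
    assert(blocksInGrid(grid) == 4);
    assert(totalThreads(grid, block) == 64);
}
```

Inside a real kernel the same expressions would be written with the built-in variables threadIdx, blockIdx, blockDim and gridDim. With the configuration from the question, threadIdx.x ranges over 0..3, threadIdx.y and threadIdx.z over 0..1, and blockIdx.x and blockIdx.y over 0..1, so the last thread of the last block gets global ID 63 under this flattening.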