What do work-items execute when a conditional is used in GPU programming?

Posted 2024-11-05 13:13:16

If you have work items executing in a wavefront and there is a conditional such as:

  if (x) {
      ...
  }
  else {
      ...
  }

What do the work-items execute? Do all work-items in the wavefront execute the first branch (i.e. x == true)? And if there are no work-items for which x is false, is the rest of the conditional skipped?

What happens if one work-item takes the alternative path? I have been told that all work-items will then execute the alternate path as well (and therefore execute both paths). Why is this the case, and how does it not mess up the program execution?

Comments (1)

触ぅ动初心 2024-11-12 13:13:16

NVIDIA GPUs use conditional execution to handle branch divergence within the SIMD group (the "warp"). In your if..else example, both branches get issued to every thread in the diverging warp, but those threads which don't follow a given branch are flagged and perform a null op instead. Because the flagged threads write no results, each thread ends up with only the effects of the branch it actually took, so program semantics are preserved. This is the classic branch divergence penalty: intra-warp branch divergence takes two passes through the code section to retire the warp. This isn't ideal, which is why performance-oriented code tries to minimize it. One thing that often catches people out is making an assumption about which section of a divergent path gets executed "first". There have been some very subtle bugs caused by second-guessing the internal order of execution within a divergent warp.
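
To make the masking idea concrete, here is a minimal CPU-side sketch in C++. It is a toy model, not how any particular GPU is implemented: the 8-lane "warp", the names `run_if_else`, `WarpResult` and `Mask`, and the two branch bodies are all made up for illustration. It only shows why issuing both paths does not corrupt results, and why a uniform condition costs one pass instead of two.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy model: a "warp" of 8 lanes stepped in lock-step on the CPU.
constexpr std::size_t kLanes = 8;
using Mask = std::uint32_t;  // bit i set => lane i is active

struct WarpResult {
    std::array<int, kLanes> out;  // per-lane result
    int passes;                   // number of branch bodies the warp issued
};

// Models:  if (x[i]) out[i] = in[i] * 2;  else out[i] = in[i] + 100;
WarpResult run_if_else(const std::array<bool, kLanes>& x,
                       const std::array<int, kLanes>& in) {
    WarpResult r{};  // out is zero-filled, passes starts at 0

    // Every lane evaluates the condition; collect the lanes that take the 'if'.
    Mask taken = 0;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (x[i]) taken |= (Mask{1} << i);
    const Mask all = (Mask{1} << kLanes) - 1;
    const Mask not_taken = all & ~taken;
    assert((taken | not_taken) == all);  // each lane belongs to exactly one path

    // 'if' body: issued only if at least one lane wants it.
    // The order of the two passes is arbitrary here; do not rely on it.
    if (taken != 0) {
        ++r.passes;
        for (std::size_t i = 0; i < kLanes; ++i)
            if (taken & (Mask{1} << i)) r.out[i] = in[i] * 2;  // masked-off lanes write nothing
    }

    // 'else' body: skipped entirely when no lane needs it.
    if (not_taken != 0) {
        ++r.passes;
        for (std::size_t i = 0; i < kLanes; ++i)
            if (not_taken & (Mask{1} << i)) r.out[i] = in[i] + 100;
    }
    return r;
}
```

In this model, a condition that alternates true/false across lanes gives `passes == 2`, while a condition that is true for every lane gives `passes == 1`. Either way each lane's `out[i]` only ever holds the value from the branch that lane selected.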

For simpler conditionals, NVIDIA GPUs support conditional evaluation at the ALU, which causes no divergence. For conditionals where the whole warp follows the same path, there is obviously no penalty either.
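
As a rough illustration of what a "simple" conditional looks like, the sketch below (again a CPU-side toy in C++, with a hypothetical 8-lane width and a made-up function name `clamp_negative_select`) rewrites `if (v < 0) v = 0;` as a per-lane select. Every lane runs the same instruction sequence and just picks one of two values, so there is no second pass. Whether a real compiler emits a select or a predicated instruction for a given source expression is its own decision; this only shows the shape of the idea.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kWidth = 8;  // hypothetical lane count, for illustration only

// Branchy form being modelled:  if (v[i] < 0) v[i] = 0;
// Select form: each lane computes a predicate and chooses between two values.
std::array<int, kWidth> clamp_negative_select(const std::array<int, kWidth>& v) {
    std::array<int, kWidth> out{};
    for (std::size_t i = 0; i < kWidth; ++i) {
        const bool p = v[i] < 0;  // per-lane predicate
        out[i] = p ? 0 : v[i];    // choose a value; no lane takes a different code path
    }
    return out;
}
```

The design point is that both candidate values are cheap to produce, so evaluating them on every lane costs less than splitting the warp into two passes.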
