Can a CUDA kernel run across multiple GPUs?
This is a fairly simple question, but googling doesn't seem to turn up the answer.
What I want to know is: if I have two identical CUDA-capable GPU cards, can my kernel span both cards, or is it bound to one card or the other? That is, is CUDA presented with the entire set of available GPU cores, or just the ones on the card the kernel is launched on?
If it can, is there anything special I need to know about in order to make it happen, and are there any examples over and above the CUDA SDK worth knowing about?
Target language is of course C/C++.
A single CUDA kernel launch is bound to a single GPU. In order to use multiple GPUs, multiple kernel launches will be required.
The CUDA runtime API operates on whichever device is currently selected. Any given kernel launch will run on the device most recently selected with cudaSetDevice().

Examples of multi-GPU programming are given in the CUDA samples: simple multi-GPU with P2P and simple multi-GPU.
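To illustrate the pattern the answer describes, here is a minimal sketch (the kernel, buffer sizes, and launch configuration are arbitrary illustrations, and error checking is omitted for brevity) that launches the same kernel once per device, each launch working on that device's own slice of the problem:

```cuda
#include <cuda_runtime.h>
#include <vector>

// Trivial example kernel: each thread writes its global index into out.
__global__ void fillKernel(int *out, int offset, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = offset + i;
}

int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    const int perDevice = 1 << 20;            // elements per GPU (arbitrary)
    std::vector<int *> buffers(deviceCount);

    // One launch per GPU: cudaSetDevice() directs subsequent allocations
    // and kernel launches to that card.
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);
        cudaMalloc(&buffers[dev], perDevice * sizeof(int));
        fillKernel<<<(perDevice + 255) / 256, 256>>>(buffers[dev],
                                                     dev * perDevice, perDevice);
    }

    // Kernel launches are asynchronous; synchronize each device
    // before using or freeing its results.
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);
        cudaDeviceSynchronize();
        cudaFree(buffers[dev]);
    }
    return 0;
}
```

Note that each launch only sees the cores of the device it was issued to; splitting the work across the cards (and merging results on the host or via P2P copies) is entirely up to the application.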