What are the default values for the arch and code options when using nvcc?

Posted on 2024-10-11 12:41:20

When compiling CUDA code, you have to select the architecture for which the code is generated. nvcc provides two options to specify this architecture:

  • arch specifies the virtual architecture, which can be compute_10, compute_11, etc.
  • code specifies the real architecture, which can be sm_10, sm_11, etc.

So a command like this:

nvcc x.cu -arch=compute_13 -code=sm_13

will generate 'cubin' code for devices with compute capability 1.3. Please correct me if I'm wrong. What I would like to know is: what are the default values for these two options? Which architecture does nvcc use by default when no value for arch or code is specified?

Comments (2)

帅气称霸 2024-10-18 12:41:20

OK, I've finally found the default values. My fault for not reading the whole chapter on GPU compilation in the NVCC documentation from beginning to end. So,

nvcc x.cu

is equivalent to

nvcc x.cu -arch=compute_10 -code=sm_10,compute_10

Those are the default values. By default, compilation targets the virtual architecture compute_10, and the resulting a.out includes the CUBIN code for the sm_10 real architecture as well as the PTX assembly code for the compute_10 virtual architecture. The CUDA driver recompiles that PTX 'just in time' if your device's architecture is newer than sm_10.
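One way to check which virtual architecture a given nvcc actually used is to print the __CUDA_ARCH__ macro from device code. The following is only a minimal sketch, not something taken from the NVCC documentation: the file name report_arch.cu and the kernel name report_arch are made up for illustration. The defaults quoted above apply to the toolkit version this answer was written for; later toolkits dropped compute_10 entirely, so the value printed on a current installation will differ.

```cuda
// report_arch.cu (hypothetical file name)
// Build with the toolkit defaults:   nvcc report_arch.cu -o report_arch
// Or with explicit targets, e.g.:    nvcc report_arch.cu -arch=compute_XX -code=sm_XX,compute_XX
// (replace XX with an architecture your toolkit supports)
#include <cstdio>
#include <cuda_runtime.h>

// Stores the __CUDA_ARCH__ value that the device compilation pass saw.
// The macro is undefined during the host pass, hence the #ifdef.
__global__ void report_arch(int *out) {
#ifdef __CUDA_ARCH__
    *out = __CUDA_ARCH__;
#else
    *out = 0;
#endif
}

// Prints a message and returns false if a CUDA runtime call failed.
static bool ok(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    int *d_arch = nullptr;
    int h_arch = -1;

    if (!ok(cudaMalloc(&d_arch, sizeof(int)), "cudaMalloc")) return 1;

    report_arch<<<1, 1>>>(d_arch);
    // A launch error here usually means the binary holds no code image
    // (neither cubin nor PTX) that the installed GPU can run.
    if (!ok(cudaGetLastError(), "kernel launch")) return 1;

    // cudaMemcpy on the default stream waits for the kernel to finish.
    if (!ok(cudaMemcpy(&h_arch, d_arch, sizeof(int), cudaMemcpyDeviceToHost),
            "cudaMemcpy")) return 1;

    cudaDeviceProp prop;
    if (!ok(cudaGetDeviceProperties(&prop, 0), "cudaGetDeviceProperties")) return 1;

    // __CUDA_ARCH__ is major*100 + minor*10, so 130 means compute_13.
    std::printf("kernel compiled for compute_%d, running on sm_%d%d\n",
                h_arch / 10, prop.major, prop.minor);

    cudaFree(d_arch);
    return 0;
}
```

If the reported compute_ value is lower than the device's sm_ value, the kernel that ran was built for an older virtual architecture, which is consistent with either a compatible cubin or the embedded PTX being JIT-compiled by the driver.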

筱武穆 2024-10-18 12:41:20

I believe the default is compute_10, as you cannot use any compute_13 features unless you explicitly specify that that's what you want. (Presumably the NVCC documentation that comes with the CUDA toolkit specifies this, but I can't find a link online.)
