What are the default values of the arch and code options when using nvcc?

Posted on 2024-10-11 12:41:20

When compiling CUDA code, you have to select the architecture for which the code is generated. nvcc provides two parameters to specify this architecture:

arch specifies the virtual architecture, which can be compute_10, compute_11, etc.
code specifies the real architecture, which can be sm_10, sm_11, etc.

So a command like this:

nvcc x.cu -arch=compute_13 -code=sm_13

will generate 'cubin' code for devices with compute capability 1.3. Please correct me if I'm wrong. What I would like to know is: what are the default values for these two parameters? Which architecture does nvcc use by default when no value for arch or code is specified?
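To make the naming convention in the question concrete, here is a minimal sketch that builds the two flags from a compute capability. The helper name `nvcc_arch_flags` is hypothetical and only illustrates how the `compute_XY` and `sm_XY` names line up; it is not part of nvcc or any CUDA tooling.

```python
from typing import List


def nvcc_arch_flags(major: int, minor: int) -> List[str]:
    """Build -arch/-code flags for one compute capability.

    Hypothetical helper for illustration: it pairs the virtual
    architecture compute_XY with the real architecture sm_XY.
    """
    suffix = f"{major}{minor}"
    return [f"-arch=compute_{suffix}", f"-code=sm_{suffix}"]


if __name__ == "__main__":
    # Compute capability 1.3, as in the command from the question.
    print(" ".join(["nvcc", "x.cu", *nvcc_arch_flags(1, 3)]))
```

For capability 1.3 this reproduces the command line quoted in the question.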

帅气称霸 2024-10-18 12:41:20

OK, I've finally managed to find the default values. My fault for not reading the whole chapter on GPU compilation in the NVCC documentation from beginning to end. So,

nvcc x.cu

is equivalent to

nvcc x.cu -arch=compute_10 -code=sm_10,compute_10

Those are the default values. By default, compilation targets the virtual architecture compute_10, and the resulting a.out includes the CUBIN code for the sm_10 real architecture plus the PTX assembly code for the compute_10 architecture. The CUDA driver recompiles that PTX 'just in time' if your device's architecture is newer than sm_10.
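The rules above can be modelled in a few lines. This is a rough sketch, not nvcc's actual logic: the names `resolve`, `can_run` and `DEFAULT_VIRTUAL_ARCH` are made up for illustration, the `compute_10` default is simply the one quoted in this answer (it depends on the toolkit version, so check the nvcc documentation for yours), and the compatibility rules follow my reading of the CUDA documentation: a cubin runs on devices with the same major capability and an equal or higher minor one, while PTX can be JIT-compiled for any equal or newer capability.

```python
import re
from typing import List, Optional, Tuple

# Default quoted in the answer; newer toolkits use a different default.
DEFAULT_VIRTUAL_ARCH = "compute_10"


def capability(name: str) -> Tuple[int, int]:
    """Parse 'sm_13' or 'compute_13' into (major, minor) = (1, 3)."""
    match = re.fullmatch(r"(?:sm|compute)_(\d{2,})", name)
    if match is None:
        raise ValueError(f"unrecognised architecture name: {name!r}")
    digits = match.group(1)
    return int(digits[:-1]), int(digits[-1])


def resolve(arch: Optional[str] = None,
            code: Optional[str] = None) -> Tuple[str, List[str]]:
    """Fill in missing -arch/-code values (simplified model)."""
    if arch is None and code is None:
        arch = DEFAULT_VIRTUAL_ARCH
        number = arch.split("_")[1]
        return arch, [f"sm_{number}", arch]
    if arch is None:
        # -code without -arch is deliberately left out of this sketch.
        raise NotImplementedError("-code without -arch is not modelled")
    if code is None:
        if arch.startswith("sm_"):
            # Shorthand: -arch=sm_XY means
            # -arch=compute_XY -code=sm_XY,compute_XY
            number = arch.split("_")[1]
            return f"compute_{number}", [arch, f"compute_{number}"]
        # Otherwise -code defaults to the value of -arch (PTX only).
        return arch, [arch]
    return arch, code.split(",")


def can_run(code_targets: List[str], device: str) -> bool:
    """Can a binary embedding these targets run on `device` (e.g. 'sm_20')?"""
    dev = capability(device)
    for target in code_targets:
        cap = capability(target)
        if target.startswith("compute_"):
            # PTX: the driver can JIT-compile it for an equal or newer device.
            if dev >= cap:
                return True
        elif dev[0] == cap[0] and dev[1] >= cap[1]:
            # cubin: same major revision, equal or higher minor revision.
            return True
    return False


if __name__ == "__main__":
    print(resolve())                          # plain `nvcc x.cu`
    print(resolve("compute_13", "sm_13"))     # command from the question
    print(can_run(resolve()[1], "sm_20"))     # default build on a 2.0 device
    print(can_run(["sm_13"], "sm_20"))        # cubin-only build on a 2.0 device
```

In this model the default build still runs on a newer device because the embedded compute_10 PTX can be JIT-compiled, whereas the question's `-code=sm_13` build embeds only a cubin and so does not carry over to a device with a different major capability.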
