nvcc: combining extern and constant

Posted 2024-12-12 19:35:39

I would like to organise my CUDA code into separate object files to be linked at the end of compiling, as in C++. To that end I'd like to be able to declare an extern pointer to __constant__ memory in a header file, and put the definition in one of the .cu files, also following the pattern from C++. But it seems that when I do so, nvcc ignores the 'extern' - it takes each declaration as a definition. Is there a way around this?

To be more specific about the code and the errors, I have this in a header file:

extern __device__ void* device_function_table[];

followed by this in a .cu file:

void* __device__ device_function_table[200];

which gives this error on compiling:

(path).cu:40: error: redefinition of ‘void* device_function_table [200]’
(path).hh:29: error: ‘void* device_function_table [200]’ previously declared here

My current solution is to use Makefile magic to glob together all my .cu files and have, in effect, one big translation unit but some semblance of file organisation. But this is already slowing down compiles noticeably, since a change to any one of my classes means recompiling all of them; and I anticipate adding several more classes.

Edit: I see I put __constant__ in the text and __device__ in the example; the question applies to both.

二手情话 2024-12-19 19:35:39

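Long story short: with a recent CUDA toolkit (I am using v8) and a device of compute capability 2.0 or higher, in Visual Studio go to Project Properties -> CUDA C/C++ -> Common, find "Generate Relocatable Device Code" in the list, and set it to "Yes (-rdc=true)".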

For the command line, this page suggests the -dc compiler option.
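As a rough sketch of how the layout from the question might look once relocatable device code is enabled: the file names in the comments (device_table.cuh, device_table.cu, kernels.cu, main.cu) and the two kernels are hypothetical, and the parts are concatenated into one listing only so that it can be compiled as a single file with -rdc=true. In a real project each part would go into its own file.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- device_table.cuh (hypothetical header, included wherever the table is used) ----
extern __device__ void* device_function_table[];

// ---- device_table.cu (exactly one translation unit holds the definition) ----
__device__ void* device_function_table[200];

// ---- kernels.cu (hypothetical; only needs the extern declaration) ----
__global__ void store_entry(int slot, void* value)
{
    if (slot >= 0 && slot < 200)
        device_function_table[slot] = value;
}

__global__ void load_entry(int slot, void** out)
{
    if (slot >= 0 && slot < 200)
        *out = device_function_table[slot];
}

// ---- main.cu (hypothetical host code) ----
int main()
{
    void** d_out = NULL;
    cudaMalloc((void**)&d_out, sizeof(void*));

    void* marker = d_out;                 // any pointer value works as test data
    store_entry<<<1, 1>>>(3, marker);     // write slot 3 of the table
    load_entry<<<1, 1>>>(3, d_out);       // read it back through a second kernel

    void* h_out = NULL;
    cudaMemcpy(&h_out, d_out, sizeof(void*), cudaMemcpyDeviceToHost);
    printf("%s\n", h_out == marker ? "ok" : "mismatch");

    cudaFree(d_out);
    return h_out == marker ? 0 : 1;
}

// Build once the parts are split into the files named above
// (object files end in .obj on Windows):
//   nvcc -dc device_table.cu kernels.cu main.cu
//   nvcc device_table.o kernels.o main.o -o app
// -dc is equivalent to --relocatable-device-code=true --compile; linking
// through nvcc performs the device link step as well.
```

With this split, a change to one .cu file only requires recompiling that file and relinking, which is the behaviour the question is after.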

玩心态 2024-12-19 19:35:39

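From the CUDA C Programming Guide version 4.0, section D.2.1.1:

The __device__, __shared__ and __constant__ qualifiers are not allowed on:
class, struct, and union data members,
formal parameters,
local variables within a function that executes on the host.
__shared__ and __constant__ variables have implied static storage.
__device__ and __constant__ variables are only allowed at file scope.
__device__, __shared__ and __constant__ variables cannot be defined as external using the extern keyword. The only exception is for dynamically allocated __shared__ variables as described in Section B.2.3.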
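
For reference, the one exception named in that passage, a dynamically allocated __shared__ array declared with extern, might look like the sketch below. The kernel name reverse_block and the sizes are made up for illustration; the array size is not known at compile time and comes from the third launch parameter.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Reverses the first n elements of data within a single block.
__global__ void reverse_block(int* data, int n)
{
    extern __shared__ int tile[];          // sized at launch, not at compile time
    int i = threadIdx.x;
    if (i < n) tile[i] = data[i];
    __syncthreads();                       // all loads finish before any store
    if (i < n) data[i] = tile[n - 1 - i];
}

int main()
{
    const int n = 8;
    int h[n];
    for (int i = 0; i < n; ++i) h[i] = i;

    int* d = NULL;
    cudaMalloc((void**)&d, n * sizeof(int));
    cudaMemcpy(d, h, n * sizeof(int), cudaMemcpyHostToDevice);

    // Third launch parameter: bytes of dynamic shared memory backing tile[].
    reverse_block<<<1, n, n * sizeof(int)>>>(d, n);

    cudaMemcpy(h, d, n * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) printf("%d ", h[i]);
    printf("\n");

    cudaFree(d);
    return 0;
}
```

This use of extern does not share a variable between object files, so it does not help with the header/definition split asked about in the question.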