What C++ language constructs does CUDA device code actually support?

Published 2024-10-15 19:11:11


Appendix D of the 3.2 version of the CUDA documentation refers to C++ support in CUDA device code.
It is clearly mentioned that CUDA supports "Classes for devices of compute capability 2.x". However, I'm working with devices of compute capability 1.1 and 1.3 and I can use this feature!

For instance, this code works:

// class definition voluntarily simplified
class Foo {
  private:
    int x_;

  public:
    __device__ Foo() { x_ = 42; }
    __device__ int bar() { return x_; }  // return type must be int, not void
};


// kernel using the previous class (uint32_t comes from <cstdint>)
__global__ void testKernel(uint32_t* ddata) {
    Foo f;
    ddata[threadIdx.x] = f.bar(); 
}

I'm also able to use widespread libraries such as the Thrust::random random-generation classes.
My only guess is that this works thanks to the automatic inlining of __device__-marked functions, but that does not explain how member variables are handled.
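For reference, a minimal sketch of the kind of device-side Thrust::random use the paragraph above refers to (the kernel name, per-thread seeding scheme, and choice of distribution are illustrative assumptions, not from the original post):

```cuda
#include <thrust/random.h>

// Each thread builds its own engine, seeded by its index so that
// the per-thread random streams differ.
__global__ void randKernel(float* out) {
    thrust::default_random_engine rng(threadIdx.x);
    thrust::uniform_real_distribution<float> dist(0.0f, 1.0f);
    out[threadIdx.x] = dist(rng);
}
```

The engine and distribution classes here are ordinary C++ classes with __device__-qualified member functions, which is exactly the pattern the question asks about.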

Have you ever used such features in the same conditions, or can you explain to me why my CUDA code behaves this way? Is there something wrong in the reference guide?


2 Answers

維他命╮ 2024-10-22 19:11:11


Officially, CUDA has no support for classes on devices prior to compute capability 2.0.

Practically, from my experience, you can use all C++ features on all devices as long as the functionality can be resolved at compile time. Devices prior to 2.0 do not support function calls (all functions are inlined) or program jumps to a variable address (only jumps to constant addresses).

This means you can use the following C++ constructs:

  • visibility (public/protected/private)
  • non-virtual inheritance
  • the whole of template programming and metaprogramming (until you stumble on nvcc bugs; there are quite a few of them as of version 3.2)
  • constructors (except when the object is declared in __shared__ memory)
  • namespaces

You cannot use the following:

  • new & delete operators (I believe devices >=2.0 can do that)
  • virtual methods (requires jumps at variable address)
  • function recursion (requires function calls)
  • exceptions

In fact, all the examples in section D.6 of the CUDA Programming Guide compile for devices < 2.0.
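The compile-time rule above can be illustrated with a short sketch combining several of the allowed constructs: templates, non-virtual inheritance, and constructors, with none of the forbidden ones (virtual methods, recursion, new/delete, exceptions). The class names and the sm_11 target are illustrative assumptions:

```cuda
// Compiles for pre-2.0 devices, e.g. with `nvcc -arch=sm_11 ...`,
// because everything below resolves at compile time.
#include <cstdint>

struct Base {
    int offset_;
    __device__ Base(int o) : offset_(o) {}            // constructor: allowed
    __device__ int offset() const { return offset_; } // non-virtual: inlined
};

template <int Scale>                  // template metaprogramming: allowed
struct Scaled : Base {                // non-virtual inheritance: allowed
    __device__ Scaled(int o) : Base(o) {}
    __device__ int apply(int x) const { return Scale * x + offset(); }
};

__global__ void scaleKernel(uint32_t* out) {
    Scaled<3> s(7);                   // fully resolved at compile time
    out[threadIdx.x] = s.apply(threadIdx.x);
}
```

Since every call target is known at compile time, nvcc can inline the whole call chain, which is why member functions and member variables work despite the lack of true function-call support.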

萌无敌 2024-10-22 19:11:11


Some C++ class functionality will work; however, the Programming Guide is essentially saying that classes are not fully supported, and therefore not all C++ class functionality will work. If you can do what you're looking to do, then you should go ahead!
