Passing data from the CPU to the GPU without explicitly passing it as a parameter
Is it possible to pass the data from CPU to GPU without explicitly passing it as a parameter?
I don't want to pass it as a parameter, mainly for syntactic-sugar reasons: I have about 20 constant parameters to pass, and I also invoke two kernels in succession with (almost) the same parameters.
I want something along the lines of:
__constant__ int* blah;

__global__ void myKernel(...) {
    // ... I want to use blah inside ...
}

int main() {
    ...
    cudaMalloc(... allocate blah ...);
    cudaMemcpy(... copy my array from CPU to blah ...);
}
Comments (4)
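cudaMemcpyToSymbol seems to be the function you're looking for. It works like cudaMemcpy, except that the destination is a device symbol (a __constant__ or __device__ variable) rather than a device pointer, and it takes an additional offset argument, a byte offset into the symbol, which looks like it will make it easier to copy into part of an array, such as one row of a 2D array.
(I'm hesitant to provide code, since I'm unable to test it, but see this thread and this post for reference.)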
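For what it's worth, here is a minimal sketch of how that could look for the question's blah array. It assumes the array size is known at compile time (N is a made-up constant), and the names sumDoubled and out are purely illustrative. Note that blah is declared as a fixed-size __constant__ array rather than a pointer, so no cudaMalloc is needed for it.

```cuda
#include <cuda_runtime.h>

#define N 8  // hypothetical array length, fixed at compile time

// Lives in constant memory; visible to every kernel in this translation unit.
__constant__ int blah[N];

// The kernel reads blah directly; blah is not in the parameter list.
__global__ void myKernel(int *out) {
    int i = threadIdx.x;
    if (i < N) out[i] = 2 * blah[i];
}

// Copies N host ints into blah, runs the kernel, and returns the sum of the
// outputs, or -1 if a CUDA call fails.
int sumDoubled(const int *host) {
    int *dOut = nullptr;
    int result[N];

    // Destination is the symbol itself, not a pointer obtained from cudaMalloc.
    if (cudaMemcpyToSymbol(blah, host, N * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOut, N * sizeof(int)) != cudaSuccess) return -1;

    myKernel<<<1, N>>>(dOut);

    cudaError_t err = cudaMemcpy(result, dOut, N * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int sum = 0;
    for (int i = 0; i < N; ++i) sum += result[i];
    return sum;
}
```

The fourth argument of cudaMemcpyToSymbol (omitted above, so it defaults to 0) is the offset in bytes from the start of the symbol. For instance, cudaMemcpyToSymbol(blah, &v, sizeof(int), 3 * sizeof(int)) would overwrite only blah[3].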
Use __device__ to declare global variables. It is used the same way as __constant__.
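A rough sketch of the __device__ variant, under the assumption that the kernel needs one value to read and one to write back; scale, touched, scaleKernel and runScale are hypothetical names. Unlike __constant__ variables, __device__ variables live in ordinary global memory and can be written from inside a kernel.

```cuda
#include <cuda_runtime.h>

__device__ float scale;    // read by the kernel, set from the host
__device__ int   touched;  // written by the kernel, read back by the host

__global__ void scaleKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= scale;         // neither global is a kernel parameter
        atomicAdd(&touched, 1);   // __device__ variables are writable on the GPU
    }
}

// Scales host[0..n) in place by s and returns how many elements the kernel
// touched, or -1 if a CUDA call fails. Assumes n > 0.
int runScale(float *host, int n, float s) {
    int zero = 0, count = -1;
    float *dData = nullptr;
    size_t bytes = n * sizeof(float);

    if (cudaMemcpyToSymbol(scale, &s, sizeof(float)) != cudaSuccess) return -1;
    if (cudaMemcpyToSymbol(touched, &zero, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&dData, bytes) != cudaSuccess) return -1;

    cudaMemcpy(dData, host, bytes, cudaMemcpyHostToDevice);
    scaleKernel<<<(n + 127) / 128, 128>>>(dData, n);
    cudaMemcpy(host, dData, bytes, cudaMemcpyDeviceToHost);

    // Reading a symbol back uses the mirror-image call.
    cudaError_t err = cudaMemcpyFromSymbol(&count, touched, sizeof(int));
    cudaFree(dData);
    return err == cudaSuccess ? count : -1;
}
```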
There are a few approaches you can take; which one fits depends on how you are going to use the data.
For example:
You can use three arrays in the kernel without passing any parameters to it. Note that this is only a usage example, not an optimized use of the memory hierarchy; using constant memory this way is not recommended.
Hope this helps.
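The three-array idea might look roughly like the sketch below. This is a reconstruction under assumptions, not necessarily the layout the answer had in mind: two inputs a and b in constant memory, one output c in global memory, and a made-up size N. It also shows why the pattern is discouraged: each thread reads a different constant-memory address, whereas constant memory performs best when all threads of a warp read the same address.

```cuda
#include <cuda_runtime.h>

#define N 256  // hypothetical size; total constant memory is limited to 64 KB

__constant__ float a[N];  // input, constant memory
__constant__ float b[N];  // input, constant memory
__device__   float c[N];  // output, global memory

// No parameters at all: every array is reached through a file-scope symbol.
__global__ void addKernel() {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) c[i] = a[i] + b[i];  // per-thread index into constant memory
}

// Computes hc[i] = ha[i] + hb[i] on the GPU for N elements.
// Returns false if any CUDA call fails.
bool addArrays(const float *ha, const float *hb, float *hc) {
    if (cudaMemcpyToSymbol(a, ha, N * sizeof(float)) != cudaSuccess) return false;
    if (cudaMemcpyToSymbol(b, hb, N * sizeof(float)) != cudaSuccess) return false;

    addKernel<<<(N + 127) / 128, 128>>>();

    return cudaMemcpyFromSymbol(hc, c, N * sizeof(float)) == cudaSuccess;
}
```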
Be careful when using cudaMemcpyToSymbol: it can introduce bugs if you are trying to copy a struct from the CPU to the GPU.
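The comment does not say which bugs it means, so the following is only a sketch of one cautious way to copy a struct, with guesses at the usual pitfalls noted in the comments. Params, its fields, and both kernels are hypothetical; the point is that a single __constant__ struct can stand in for the question's 20 or so constants and be shared by two kernels.

```cuda
#include <type_traits>
#include <cuda_runtime.h>

// Hypothetical bundle of constants shared by both kernels.
struct Params {
    int   width;
    int   height;
    float alpha;
    float beta;
};

// A raw byte copy is only safe for trivially copyable types. Any pointer
// members would also have to hold device addresses, not host ones.
static_assert(std::is_trivially_copyable<Params>::value,
              "Params must be trivially copyable");

__constant__ Params params;

__global__ void firstKernel(float *out) {
    int i = threadIdx.x;
    if (i < params.width) out[i] = params.alpha * i + params.beta;
}

__global__ void secondKernel(float *out) {
    int i = threadIdx.x;
    if (i < params.width) out[i] *= params.height;
}

// Uploads h once, runs both kernels, and copies h.width results to hostOut.
// Assumes 0 < h.width <= 64. Returns false on any failure.
bool runBoth(const Params &h, float *hostOut) {
    if (h.width <= 0 || h.width > 64) return false;
    float *dOut = nullptr;
    size_t bytes = h.width * sizeof(float);

    // Pass the symbol itself (not its name as a string) and the full struct size.
    if (cudaMemcpyToSymbol(params, &h, sizeof(Params)) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) return false;

    firstKernel<<<1, 64>>>(dOut);
    secondKernel<<<1, 64>>>(dOut);  // same stream, so it runs after firstKernel

    cudaError_t err = cudaMemcpy(hostOut, dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return err == cudaSuccess;
}
```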