How to return a single variable from a CUDA kernel function?
I have a CUDA search function that calculates a single value. How can I return it?
__global__
void G_SearchByNameID(node* Node, long nodeCount, long start,char* dest, long answer){
answer = 2;
}
cudaMemcpy(h_answer, d_answer, sizeof(long), cudaMemcpyDeviceToHost);
cudaFree(d_answer);
For both of these lines I get this error:
error: argument of type "long" is incompatible with parameter of type "const void *"
Comments (2)
I've been using __device__ variables for this purpose. That way you don't have to bother with cudaMalloc and cudaFree, and you don't have to pass a pointer as a kernel argument, which saves you a register in your kernel to boot.
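A minimal sketch of the __device__ variable approach might look like the following. The kernel is a simplified stand-in for the one in the question (its search arguments are dropped), and the names d_answer and search_with_device_variable are assumptions made for illustration, not part of the original post. The result is read back with cudaMemcpyFromSymbol, so no cudaMalloc or cudaFree is needed.

```cuda
#include <cuda_runtime.h>

// Hypothetical global that lives in device memory; the kernel writes its result here.
__device__ long d_answer;

// Simplified stand-in for the question's kernel (search arguments omitted).
__global__ void G_SearchByNameID()
{
    d_answer = 2;
}

// Launches the kernel and copies the result back to the host.
// Returns -1 if the launch or the copy reports a CUDA error.
long search_with_device_variable()
{
    long h_answer = -1;

    G_SearchByNameID<<<1, 1>>>();
    if (cudaGetLastError() != cudaSuccess) {
        return -1;
    }

    // Copies from the device symbol to host memory (device-to-host by default).
    // The call blocks until the preceding kernel on the default stream has finished.
    cudaError_t err = cudaMemcpyFromSymbol(&h_answer, d_answer, sizeof(long));
    return err == cudaSuccess ? h_answer : -1;
}
```

Calling search_with_device_variable() from main should hand back the value the kernel wrote. One trade-off of this design is that the variable is global to the module, so concurrent launches would share it.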
To get a single result you have to copy it back with cudaMemcpy, which means the kernel must write it through a pointer to device memory.
I guess the error comes up because you are passing a long value instead of a pointer to a long value.
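The pointer-based fix described above could be sketched roughly as follows. This is an illustration under assumptions rather than the answerer's original snippet: the kernel is reduced to the single result parameter, and the helper name search_with_pointer is hypothetical. The key change is that answer becomes a long* pointing at device memory, so both cudaMemcpy and cudaFree receive pointers instead of a long.

```cuda
#include <cuda_runtime.h>

// The result parameter is a pointer to device memory, not a long passed by value.
__global__ void G_SearchByNameID(long* answer)
{
    *answer = 2;
}

// Allocates one long on the device, runs the kernel, and copies the result back.
// Returns -1 if allocation or the copy reports a CUDA error.
long search_with_pointer()
{
    long h_answer = -1;
    long* d_answer = nullptr;

    if (cudaMalloc((void**)&d_answer, sizeof(long)) != cudaSuccess) {
        return -1;
    }

    G_SearchByNameID<<<1, 1>>>(d_answer);

    // Destination is the ADDRESS of the host variable; source is the device pointer.
    // cudaMemcpy waits for the kernel on the default stream before copying.
    cudaError_t err = cudaMemcpy(&h_answer, d_answer, sizeof(long),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_answer);

    return err == cudaSuccess ? h_answer : -1;
}
```

In the question's code, h_answer and d_answer appear to have been declared as plain long values, which would explain why the compiler rejected both the cudaMemcpy and the cudaFree call.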