Missing configuration error from a CUDA kernel launched with pthreads
What does a missing configuration error mean in CUDA? The code below is a pthread thread function; when I run it, the error obtained is 1, which corresponds to a missing configuration error. What is the mistake in this code?
void* run(void *args)
{
    cudaError_t error;
    Matrix *matrix = (Matrix*)args;
    int scalar = 2;
    dim3 dimGrid(1,1,1);
    dim3 dimBlock(1024,1,1);
    cudaEvent_t start, stop;
    cudaSetDevice(0);
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start, 0);
    for (int i = 0; i < matrix->number; i++)
    {
        syntheticKernel<<<dimGrid,dimBlock>>>();
        cudaThreadSynchronize();
    }
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&matrix->time, start, stop);
    error = cudaGetLastError();
    assert(error != 0);
    printf("%d\n", error);
}
Comments (2)
Can you add more detail about your program, please? The CUDA API routines each return a status code; you should check the status of each API call to catch and decode the first reported error.
One point to check is that you have not called any CUDA API routines before you fork the pthreads. Creating a CUDA context (which happens automatically for most, but not all, CUDA API routines) before you fork the threads will cause problems. Check this, and if it's not the problem, add more details to your question and check the return value of all API calls.
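As a rough illustration of this answer's advice (not the original poster's code), the sketch below wraps every runtime call in a checking macro and makes cudaSetDevice the first CUDA call inside the thread, so the context is created only after pthread_create. Matrix and syntheticKernel are assumed to be defined elsewhere as in the question.

    // Sketch: every runtime call goes through CHECK so the first failing
    // call is reported immediately, and the CUDA context is created inside
    // the pthread (no CUDA calls are made before pthread_create).
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    #define CHECK(call)                                             \
        do {                                                        \
            cudaError_t err = (call);                               \
            if (err != cudaSuccess) {                               \
                fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                        cudaGetErrorString(err));                   \
                exit(EXIT_FAILURE);                                 \
            }                                                       \
        } while (0)

    void* run(void *args)
    {
        Matrix *matrix = (Matrix*)args;
        dim3 dimGrid(1, 1, 1);
        dim3 dimBlock(1024, 1, 1);
        cudaEvent_t start, stop;

        CHECK(cudaSetDevice(0));            // first CUDA call made in this thread
        CHECK(cudaEventCreate(&start));
        CHECK(cudaEventCreate(&stop));
        CHECK(cudaEventRecord(start, 0));

        for (int i = 0; i < matrix->number; i++) {
            syntheticKernel<<<dimGrid, dimBlock>>>();
            CHECK(cudaGetLastError());      // catches launch/configuration errors
            CHECK(cudaDeviceSynchronize()); // catches execution errors
        }

        CHECK(cudaEventRecord(stop, 0));
        CHECK(cudaEventSynchronize(stop));
        CHECK(cudaEventElapsedTime(&matrix->time, start, stop));
        return NULL;
    }

With this in place, the message from cudaGetErrorString identifies the first failing call, which usually makes it clear whether the context was created in the wrong thread or the launch configuration itself is at fault.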
Why are you launching a single block in the grid? This configuration, dimGrid(1,1,1) with dimBlock(1024,1,1), looks suspicious.
Try increasing the grid size and putting fewer threads in each block. But your main problem is probably about contexts, as Tom suggests.
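As a minimal sketch of that suggestion (using a hypothetical element count N that is not taken from the question), the launch configuration could be derived from the problem size rather than hard-coded to a single 1024-thread block:

    // Hypothetical configuration: 256 threads per block, and enough blocks
    // to cover N elements. N and the block size are assumptions for
    // illustration, not values from the original question.
    const int N = 1 << 20;
    const int threadsPerBlock = 256;
    const int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;

    dim3 dimBlock(threadsPerBlock, 1, 1);
    dim3 dimGrid(blocksPerGrid, 1, 1);
    syntheticKernel<<<dimGrid, dimBlock>>>();

Block sizes of 128 to 256 threads are a common starting point; the best choice depends on the kernel and should be tuned for the target device.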