OpenCL invalid command queue error when copying from private memory to global memory
I am trying to fix an error in my program, and I have narrowed it down to a very small area.
Whenever I try to copy data from the device's private memory into global memory, the command queue is invalidated and clFinish() returns an error.
Consider a simple example:
kernel void example(global int *data, const int width) {
    int id = get_global_id(0);
    if (id == 0) {
        int copy[width]; // private memory?
        for (int i = 0; i < width; i++) {
            copy[i] = data[i]; // works
            data[i] = copy[i]; // works
        }
        // whenever this loop is here
        // i get invalid command queue from clFinish
        for (int i = 0; i < width; i++) {
            data[i] = copy[i];
        }
    }
}
Can somebody explain why this happens?
Thank you
Comments (1)
If the width does not exceed the maximum size, private memory will be fine. I recommend running the kernel with a small width, for example 8 or 16, and checking the result. If you have been passing a large value for width, it might not be possible to hold all of the data in private memory. Private memory is typically backed by registers, which are very limited in size.
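One way to act on this advice is to keep the private array small and fixed in size, and to walk through the buffer one chunk at a time. The kernel below is only a sketch under that assumption: the name CHUNK and its value of 16 are illustrative and not from the original post, and a suitable size depends on the device. It also avoids declaring the array with a runtime length, since OpenCL C does not support variable-length arrays.

```c
// Hypothetical chunk size (an assumption, not from the original post).
// Keep it small so the private array can plausibly stay in registers.
#define CHUNK 16

kernel void example(global int *data, const int width) {
    int id = get_global_id(0);
    if (id == 0) {
        int copy[CHUNK]; // fixed-size private array

        // Process the buffer one chunk at a time.
        for (int base = 0; base < width; base += CHUNK) {
            // The last chunk may be shorter than CHUNK.
            int n = min(CHUNK, width - base);

            // Global -> private
            for (int i = 0; i < n; i++) {
                copy[i] = data[base + i];
            }
            // Private -> global
            for (int i = 0; i < n; i++) {
                data[base + i] = copy[i];
            }
        }
    }
}
```

If the error disappears with a small CHUNK, that would be consistent with the explanation above that the original private array was too large for the device.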