Why do I get CL_MEM_OBJECT_ALLOCATION_FAILURE?
I'm allocating a cl_mem buffer on a GPU and working on it, which works fine until a certain size is exceeded. Beyond that size the allocation itself still succeeds, but execution or copying fails. I want to use the device's memory for faster operation, so I allocate like this:
buf = clCreateBuffer (cxGPUContext, CL_MEM_WRITE_ONLY, buf_size, NULL, &ciErrNum);
What I don't understand is the size limit. I'm copying about 16 MB but should be able to use about 128 MB (see CL_DEVICE_MAX_MEM_ALLOC_SIZE).
Why do these numbers differ so much?
Here is an excerpt from oclDeviceQuery:
CL_PLATFORM_NAME: NVIDIA
CL_PLATFORM_VERSION: OpenCL 1.0
OpenCL SDK Version: 4788711
CL_DEVICE_NAME: GeForce 8600 GTS
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 255 MByte
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
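As a quick sanity check on the numbers above, the requested size can be compared against the reported limits. The sketch below is plain C with the two values from the oclDeviceQuery excerpt hard-coded; `fits_device_limits` is a hypothetical helper, not an OpenCL call. In a real program the limits would be read with clGetDeviceInfo using CL_DEVICE_MAX_MEM_ALLOC_SIZE and CL_DEVICE_GLOBAL_MEM_SIZE.

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Values copied from the oclDeviceQuery excerpt (hard-coded for illustration). */
#define MAX_MEM_ALLOC_BYTES (128ULL * MIB)
#define GLOBAL_MEM_BYTES    (255ULL * MIB)

/* Hypothetical helper: returns 1 if a single buffer of `bytes` is within
 * both reported limits, 0 otherwise. Passing this check does not guarantee
 * that the allocation will succeed; it only rules out the documented limits. */
int fits_device_limits(uint64_t bytes)
{
    return bytes > 0
        && bytes <= MAX_MEM_ALLOC_BYTES
        && bytes <= GLOBAL_MEM_BYTES;
}
```

For the 16 MB case, `fits_device_limits(16 * MIB)` returns 1, so the documented limits alone do not explain the failure.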
Comments (2)
clCreateBuffer will not actually create a buffer on the device. This makes sense, since at the time of creation the driver does not know which device will use the buffer (recall that a context can have multiple devices). The buffer will be created on the actual device when you enqueue a write or when you launch a kernel that takes the buffer as a parameter.
As for the 16 MB limit, are you using the latest driver (195.xx)? If so, you should contact NVIDIA, either through the forums or directly.
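Since the failure shows up at enqueue time rather than at clCreateBuffer, it helps to check and name the status code returned by every call. Here is a minimal sketch, with the numeric values written out as they appear in the Khronos cl.h header so that it compiles without the OpenCL SDK. `cl_status_name` and `check_cl` are hypothetical helpers; real code would use the CL_* macros from the header instead of bare numbers.

```c
#include <stdio.h>

/* Maps a few memory-related OpenCL status codes to their symbolic names.
 * Numeric values are those defined in the Khronos cl.h header. */
const char *cl_status_name(int status)
{
    switch (status) {
    case   0: return "CL_SUCCESS";
    case  -4: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case  -5: return "CL_OUT_OF_RESOURCES";
    case  -6: return "CL_OUT_OF_HOST_MEMORY";
    case -61: return "CL_INVALID_BUFFER_SIZE";
    default:  return "other OpenCL error";
    }
}

/* Returns 1 on success; otherwise prints which call failed and returns 0. */
int check_cl(int status, const char *call)
{
    if (status != 0) {
        fprintf(stderr, "%s failed: %s (%d)\n",
                call, cl_status_name(status), status);
        return 0;
    }
    return 1;
}
```

For example, calling `check_cl(ciErrNum, "clEnqueueWriteBuffer")` right after the copy would print the symbolic name if the deferred allocation fails at that point.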
Don't forget whatever other memory you happen to have used on the device (and, if this is also your graphics card, the memory that your display is using).
(Is there a way to get the current available memory, or the largest fragment, or somesuch?)
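To get a feel for how much the display alone might take, here is a back-of-the-envelope estimate in plain C. The resolution, bytes per pixel and buffer count are assumptions for illustration only, and both function names are hypothetical; the driver's real usage may differ substantially.

```c
#include <stdint.h>

/* Rough estimate of the memory taken by display surfaces.
 * All parameters are assumptions supplied by the caller. */
uint64_t framebuffer_bytes(uint32_t width, uint32_t height,
                           uint32_t bytes_per_pixel, uint32_t buffers)
{
    return (uint64_t)width * height * bytes_per_pixel * buffers;
}

/* Memory left after subtracting other known consumers; returns 0 instead
 * of wrapping around if the consumers exceed the total. */
uint64_t remaining_bytes(uint64_t global_mem, uint64_t used_elsewhere)
{
    return used_elsewhere >= global_mem ? 0 : global_mem - used_elsewhere;
}
```

With a hypothetical 1920x1200 desktop, double-buffered at 4 bytes per pixel, `framebuffer_bytes(1920, 1200, 4, 2)` gives 18,432,000 bytes (about 17.6 MiB) out of the 255 MB the device reports.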