Why do I get CL_MEM_OBJECT_ALLOCATION_FAILURE?

Posted 2024-08-10 12:28:48

I'm allocating a cl_mem buffer on a GPU and working on it, which works fine until the buffer exceeds a certain size. Beyond that size the allocation itself still succeeds, but execution or copying fails. I want to use the device's memory for faster operation, so I allocate like this:

buf = clCreateBuffer (cxGPUContext, CL_MEM_WRITE_ONLY, buf_size, NULL, &ciErrNum);

What I don't understand is the size limit. I'm copying about 16 MB, but I should be able to use about 128 MB (see CL_DEVICE_MAX_MEM_ALLOC_SIZE).

Why do these numbers differ so much?

Here are some excerpts from oclDeviceQuery:

 CL_PLATFORM_NAME:  NVIDIA
 CL_PLATFORM_VERSION:  OpenCL 1.0 
 OpenCL SDK Version:  4788711

  CL_DEVICE_NAME:          GeForce 8600 GTS
  CL_DEVICE_TYPE:          CL_DEVICE_TYPE_GPU
  CL_DEVICE_ADDRESS_BITS:              32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:  128 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:     255 MByte
  CL_DEVICE_LOCAL_MEM_TYPE:      local
  CL_DEVICE_LOCAL_MEM_SIZE:      16 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:  64 KByte

Comments (2)

丑丑阿 2024-08-17 12:28:48

clCreateBuffer will not actually create a buffer on the device. This makes sense, since at the time of creation the driver does not know which device will use the buffer (recall that a context can have multiple devices). The buffer will be created on the actual device when you enqueue a write or when you launch a kernel that takes the buffer as a parameter.

As for the 16 MB limit, are you using the latest driver (195.xx)? If so, you should contact NVIDIA, either through the forums or directly.
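
Because the device-side allocation is deferred as described above, an out-of-memory condition tends to surface from the enqueue call rather than from clCreateBuffer, so it helps to check the status of every call. Below is a minimal sketch of such a check. The helpers cl_err_name and cl_check are hypothetical names, not part of OpenCL; the numeric codes are the ones defined in the Khronos CL/cl.h header, and they are repeated behind guards only so the sketch compiles without that header.

```c
#include <stdio.h>

/* Status codes as defined in CL/cl.h; the guards avoid redefinition
   when the real header has already been included. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                        0
#endif
#ifndef CL_MEM_OBJECT_ALLOCATION_FAILURE
#define CL_MEM_OBJECT_ALLOCATION_FAILURE  (-4)
#endif
#ifndef CL_OUT_OF_RESOURCES
#define CL_OUT_OF_RESOURCES               (-5)
#endif
#ifndef CL_OUT_OF_HOST_MEMORY
#define CL_OUT_OF_HOST_MEMORY             (-6)
#endif
#ifndef CL_INVALID_BUFFER_SIZE
#define CL_INVALID_BUFFER_SIZE            (-61)
#endif

/* Hypothetical helper: readable name for the memory-related status codes. */
const char *cl_err_name(int err)
{
    switch (err) {
    case CL_SUCCESS:                       return "CL_SUCCESS";
    case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case CL_OUT_OF_RESOURCES:              return "CL_OUT_OF_RESOURCES";
    case CL_OUT_OF_HOST_MEMORY:            return "CL_OUT_OF_HOST_MEMORY";
    case CL_INVALID_BUFFER_SIZE:           return "CL_INVALID_BUFFER_SIZE";
    default:                               return "unknown OpenCL error";
    }
}

/* Hypothetical helper: returns 1 if the call succeeded, otherwise logs
   which call failed and returns 0. */
int cl_check(int err, const char *call)
{
    if (err != CL_SUCCESS) {
        fprintf(stderr, "%s failed: %s (%d)\n", call, cl_err_name(err), err);
        return 0;
    }
    return 1;
}
```

With the real header included, a write could be wrapped as cl_check(clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, buf_size, host_ptr, 0, NULL, NULL), "clEnqueueWriteBuffer"), where queue and host_ptr are assumed to exist in the calling code. That is the point at which CL_MEM_OBJECT_ALLOCATION_FAILURE would be expected to show up.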

零度° 2024-08-17 12:28:48

Don't forget whatever other memory you happen to have used on the device (and, if this is also your graphics card, the memory that your display is using).

(Is there a way to get the current available memory, or the largest fragment, or somesuch?)
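
As far as I know, core OpenCL 1.0 has no query for the memory currently free on a device, so one workaround is to keep a host-side tally of what the program itself has requested. The sketch below checks a new request against the two limits reported by oclDeviceQuery. The function name fits_reported_limits and its parameters are hypothetical, and the check is only an upper bound: memory used by the display or by other contexts is not visible to it, so a request that passes can still fail on the device.

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Hypothetical helper: returns 1 if a new buffer of `request` bytes stays
   within both reported limits, given `in_use` bytes already requested by
   this program; returns 0 otherwise.
   max_alloc  : value of CL_DEVICE_MAX_MEM_ALLOC_SIZE (per-buffer limit)
   global_mem : value of CL_DEVICE_GLOBAL_MEM_SIZE (total device memory) */
int fits_reported_limits(uint64_t request, uint64_t in_use,
                         uint64_t max_alloc, uint64_t global_mem)
{
    if (request == 0 || request > max_alloc)
        return 0;                       /* violates the per-buffer limit */
    if (in_use > global_mem)
        return 0;                       /* tally already exceeds the device */
    return request <= global_mem - in_use;  /* subtraction cannot underflow */
}
```

With the figures from the question (128 MB maximum allocation, 255 MB global memory), fits_reported_limits(16 * MIB, 0, 128 * MIB, 255 * MIB) passes, while a second 128 MB buffer on top of an existing 128 MB one does not, because only 127 MB of the reported total would remain.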
