3D array representation in CUDA

Posted 2024-12-03 05:32:28

I have a 3D image that I need to copy into CUDA global memory through a pointer. At the moment I do the following, where the image is stored as a linear 1D array:

   float *image = new float[noOfVoxels];
   readImage(image);                           // fills the linear 1D array
   size_t sizef = noOfVoxels * sizeof(float);  // size in bytes
   float *devI;
   cudaMalloc((void**)&devI, sizef);
   cudaMemcpy(devI, image, sizef, cudaMemcpyHostToDevice);

How can I allocate a 3D array in device memory, something like this?

    float image[][][];  // 3D array


Comments (3)

深海里的那抹蓝 2024-12-10 05:32:28

How do you plan to access the data once it is on the GPU?

If you are doing lots of random accesses and would benefit from spatial locality, you should allocate a 3D CUDA array (cudaMalloc3DArray rather than cudaMalloc3D, since 3D textures are backed by CUDA arrays) and bind it to a 3D texture.

If you are doing predictable, coalesced accesses, then the linear memory indexing you have now works well.

裸钻 2024-12-10 05:32:28

Note that your PC's memory is not three-dimensional; the 3D shape is only a way of visualizing the data. You can therefore store the 3D image behind a single pointer, so why not keep it that way on the host side?

Accessing voxel (i, j, z), that is row i, column j, slice z, is the same as:

    image3D[i*cols + j + rows*cols*z];

Now pass the single-pointer image3D to CUDA.
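The index mapping in this answer can be sketched as a pair of small helpers. This is only an illustration: `VolumeDims`, `Voxel`, `flatIndex`, and `voxelAt` are hypothetical names that do not appear in the thread, and the layout assumed is the one in the formula above (column index fastest, slice index slowest).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: extents of the volume along each axis.
struct VolumeDims {
    std::size_t rows;    // extent along i
    std::size_t cols;    // extent along j
    std::size_t slices;  // extent along z
};

// Hypothetical helper type: one voxel coordinate.
struct Voxel {
    std::size_t i, j, z;
};

// Flat offset of voxel (i, j, z): i*cols + j + rows*cols*z.
inline std::size_t flatIndex(const VolumeDims& d,
                             std::size_t i, std::size_t j, std::size_t z)
{
    assert(i < d.rows && j < d.cols && z < d.slices);
    return i * d.cols + j + d.rows * d.cols * z;
}

// Inverse mapping: recover (i, j, z) from a flat offset.
inline Voxel voxelAt(const VolumeDims& d, std::size_t idx)
{
    const std::size_t sliceSize = d.rows * d.cols;
    assert(idx < sliceSize * d.slices);
    const std::size_t inSlice = idx % sliceSize;  // offset within slice z
    return Voxel{inSlice / d.cols, inSlice % d.cols, idx / sliceSize};
}
```

The same arithmetic is plain integer math, so it could be reused inside a kernel to turn a thread's (i, j, z) position into an offset into the buffer copied with cudaMemcpy.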

咆哮 2024-12-10 05:32:28

You are best off using cudaMallocPitch(). It still allocates memory as a single 1D chunk, which you must access by converting between 3D subscripts and a 1D index, but the benefit is that it pads each row so that every row starts at a well-aligned address.

Alternatively, cudaMalloc3D() also returns a pointer to pitched device memory.
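To make the subscript-to-address conversion for pitched memory concrete, here is a host-only sketch of the arithmetic. It does not call the CUDA runtime: `simulatedPitch` and `pitchedElement` are hypothetical helpers, and the pitch is faked by rounding the row width up to an assumed alignment. On the device, the pitch would instead be the value reported by cudaMallocPitch() or stored in the cudaPitchedPtr filled in by cudaMalloc3D().

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the pitch chosen by the CUDA runtime (assumption for this
// sketch): round the row width in bytes up to a multiple of `alignment`.
inline std::size_t simulatedPitch(std::size_t widthBytes, std::size_t alignment)
{
    assert(alignment > 0);
    return (widthBytes + alignment - 1) / alignment * alignment;
}

// Address of element (x, y, z) in a pitched volume of floats.
// Rows are `pitchBytes` apart and each slice holds `height` rows, so the
// row for (y, z) starts at base + (z * height + y) * pitchBytes.
inline float* pitchedElement(void* base, std::size_t pitchBytes,
                             std::size_t height,
                             std::size_t x, std::size_t y, std::size_t z)
{
    assert(y < height);
    char* bytes = static_cast<char*>(base);
    char* row = bytes + (z * height + y) * pitchBytes;
    return reinterpret_cast<float*>(row) + x;
}
```

The key point is that the row stride is the pitch in bytes, not the row width times sizeof(float), so the byte-level pointer arithmetic has to happen before casting back to float*.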
