3D array representation in CUDA
I have a 3D image that I need to copy to CUDA global memory using pointers.
Currently I do the following. In this implementation the image is stored as a linear 1D array.
float *image = new float[noOfVoxels];
readImage(image);  // fills a linear 1D array
size_t sizef = noOfVoxels * sizeof(float);  // size in bytes
float *devI;
cudaMalloc((void**)&devI, sizef);
cudaMemcpy(devI, image, sizef, cudaMemcpyHostToDevice);
How can I allocate a 3D array in device memory, i.e. something like the following?
float image[][][];
Comments (3)
How are you planning to access the data once it's on the GPU?
If you're doing lots of random accesses and would benefit from spatial locality, then you should use a 3D texture. Note that a 3D texture is backed by a CUDA array allocated with cudaMalloc3DArray, not by the pitched linear memory that cudaMalloc3D returns.
If you're doing predictable, coalesced accesses, then linear memory indexing as you have it now is great.
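As a minimal sketch of what "linear memory indexing" means here, the helpers below convert between 3D subscripts and a 1D offset. The names `Dims`, `linearIndex`, and `unravel` are hypothetical, and the layout (x varying fastest, then y, then z) is an assumption; it has to match whatever order `readImage` actually fills the array in.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: volume dimensions in voxels.
struct Dims {
    std::size_t nx, ny, nz;
};

// Map 3D subscripts to an offset in a linear array.
// Assumed layout: x varies fastest, then y, then z.
inline std::size_t linearIndex(Dims d, std::size_t x, std::size_t y, std::size_t z) {
    assert(x < d.nx && y < d.ny && z < d.nz);
    return x + d.nx * (y + d.ny * z);
}

// Inverse mapping: recover (x, y, z) from a linear offset.
inline void unravel(Dims d, std::size_t i,
                    std::size_t& x, std::size_t& y, std::size_t& z) {
    assert(i < d.nx * d.ny * d.nz);
    x = i % d.nx;
    y = (i / d.nx) % d.ny;
    z = i / (d.nx * d.ny);
}
```

Inside a kernel the same arithmetic applies to the `devI` pointer, for example `devI[x + nx * (y + ny * z)]`. With x taken from `threadIdx.x`, neighbouring threads touch neighbouring addresses, which is what makes the accesses coalesce.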
Note that your PC's memory is not 3D. The three dimensions are only a matter of how you visualize the data, so a 3D image can be stored behind a single pointer. So why not keep the 3D image as a single pointer on the host side?
Then pass that single pointer, image3D, to CUDA.
You are best off using cudaMallocPitch(). It still allocates memory as a single chunk, i.e. 1D, which you must access by converting between 3D subscripts and a 1D index, but the benefit is that it pads each row so that every row starts at a properly aligned address. Alternatively, cudaMalloc3D() also returns a pointer to pitched device memory.
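To illustrate what "pitched" means, here is a host-only sketch of the address arithmetic. It does not call the CUDA runtime. The function names are hypothetical, and the 256-byte alignment is only an assumed value for illustration: in real code the pitch is whatever cudaMallocPitch() or cudaMalloc3D() reports, never something you compute yourself.

```cpp
#include <cassert>
#include <cstddef>

// Round a row width in bytes up to a multiple of `alignment`.
// Illustration only: the real pitch is chosen by the CUDA runtime.
inline std::size_t paddedPitch(std::size_t rowBytes, std::size_t alignment) {
    assert(alignment > 0);
    return (rowBytes + alignment - 1) / alignment * alignment;
}

// Byte offset of voxel (x, y, z) in a pitched volume of float.
// `pitch` is the padded row size in bytes, `ny` the number of rows per slice.
inline std::size_t pitchedByteOffset(std::size_t pitch, std::size_t ny,
                                     std::size_t x, std::size_t y, std::size_t z) {
    assert(y < ny);
    assert(x * sizeof(float) < pitch);
    std::size_t rowStart = (z * ny + y) * pitch;  // skip whole padded rows
    return rowStart + x * sizeof(float);          // then step within the row
}
```

In a kernel the equivalent step is to cast the base pointer to `char*`, add the row offset `(z * ny + y) * pitch`, cast the result back to `float*`, and index it with x. The pitch has to be applied in bytes, which is why the cast to `char*` comes first.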