Segmentation fault in cudaMemcpy2D

Posted on 2024-12-12 08:06:23

I have a 2D array dev_histogram stored on the GPU and a 2D array histogram stored on the CPU. I want to copy the contents of dev_histogram into histogram. Below are the relevant parts of my program. I can post the full code as well.

int *dev_histogram;                   // Array for histogram, GPU
int histogram[SIZE_THETA][SIZE_RHO];  // Array for histogram, CPU

size_t pitch;
histogramSize = sizeof(int) * SIZE_THETA * SIZE_RHO;
cudaMallocPitch((void**)&dev_histogram, &pitch, SIZE_THETA * sizeof(int), SIZE_RHO);

houghTransformation<<<width, height>>>(dev_edges, dev_histogram, pitch, n_pixels, width, height);

// Here I get a Segmentation fault:
cudaMemcpy2D(histogram, pitch, dev_histogram, SIZE_THETA * sizeof(int), SIZE_THETA * sizeof(int), SIZE_RHO * sizeof(int), cudaMemcpyDeviceToHost);

Could you please help me understand how to copy my matrix back? Mostly, I am confused about what to pass as the pitch for my source.

你是年少的欢喜 2024-12-19 08:06:23

Specify SIZE_RHO as the height, not SIZE_RHO * sizeof(int):

<cudaMemcpy2D(histogram, pitch, dev_histogram, SIZE_THETA * sizeof(int), SIZE_THETA * sizeof(int), SIZE_RHO * sizeof(int), cudaMemcpyDeviceToHost);
>cudaMemcpy2D(histogram, pitch, dev_histogram, SIZE_THETA * sizeof(int), SIZE_THETA * sizeof(int), SIZE_RHO, cudaMemcpyDeviceToHost);

倾城泪 2024-12-19 08:06:23

In the CUDA Toolkit reference manual you can see that the pitch returned by cudaMallocPitch is the allocated width, in bytes, of each row of the 2D array you are copying. Your dev_histogram therefore has an actual row width equal to pitch and a height equal to the height you specified. Each row of your 2D array has pitch bytes allocated, but only width*sizeof(int) bytes of valid data.
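To make the row layout concrete, here is a minimal host-side sketch of how rows are addressed in pitched memory. The helper names pitchedRow and rowPadding are hypothetical (they are not part of the CUDA API); the pointer arithmetic follows the pattern the cudaMallocPitch documentation gives for locating a row, and the same expression can be used inside a kernel.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a CUDA API call): pointer to the first element of
// row `row` in a pitched allocation. `pitch` is in BYTES, so the offset must
// be applied to a char pointer before casting back to int*.
inline int *pitchedRow(int *base, std::size_t pitch, std::size_t row) {
    return reinterpret_cast<int *>(reinterpret_cast<char *>(base) + row * pitch);
}

// Hypothetical helper: number of unused padding bytes at the end of each row
// when every row holds `widthElems` ints of valid data.
inline std::size_t rowPadding(std::size_t pitch, std::size_t widthElems) {
    std::size_t validBytes = widthElems * sizeof(int);
    assert(pitch >= validBytes);  // the pitch is never smaller than the requested row width
    return pitch - validBytes;
}
```

Element (row, col) is then pitchedRow(base, pitch, row)[col]. Indexing the buffer as base[row * widthElems + col] would only be correct if the pitch happened to equal widthElems * sizeof(int).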

In the same document the prototype for cudaMemcpy2D is

cudaError_t cudaMemcpy2D(void *dst, size_t dpitch, const void *src, size_t spitch, size_t width, size_t height, enum cudaMemcpyKind kind)

Here dst is your array on the host, dpitch is the pitch (row width in bytes) of the destination array (histogram), and spitch is the pitch of the source array (dev_histogram). width is the number of bytes to copy per row and height is the number of rows.
You must therefore call it like this:

cudaMemcpy2D(histogram, SIZE_THETA*sizeof(int), dev_histogram, pitch, SIZE_THETA * sizeof(int), SIZE_RHO, cudaMemcpyDeviceToHost);
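As a rough illustration of why the argument order matters, the following host-only sketch emulates the row-by-row semantics of a 2D copy and computes how far into a buffer such a copy reaches. copy2DHost and bytesTouched are hypothetical helpers written for this explanation; they are not how CUDA implements cudaMemcpy2D, and the sizes used in the note below are assumed values, not taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: distance in bytes from the start of a buffer to the
// end of the last row touched by a 2D copy of `height` rows, each
// `widthBytes` long, with consecutive rows `pitch` bytes apart.
inline std::size_t bytesTouched(std::size_t pitch, std::size_t widthBytes,
                                std::size_t height) {
    return height == 0 ? 0 : (height - 1) * pitch + widthBytes;
}

// Host-only emulation of a pitched 2D copy: row r is read from
// src + r * spitch and written to dst + r * dpitch.
inline void copy2DHost(void *dst, std::size_t dpitch,
                       const void *src, std::size_t spitch,
                       std::size_t widthBytes, std::size_t height) {
    assert(dpitch >= widthBytes && spitch >= widthBytes);
    char *d = static_cast<char *>(dst);
    const char *s = static_cast<const char *>(src);
    for (std::size_t row = 0; row < height; ++row) {
        std::memcpy(d + row * dpitch, s + row * spitch, widthBytes);
    }
}
```

Assuming, for example, SIZE_THETA = 180, SIZE_RHO = 100, 4-byte ints and a device pitch of 1024 bytes, the host array occupies 72000 bytes. The corrected call reaches bytesTouched(720, 720, 100) = 72000 bytes into it, which fits exactly. The call from the question uses the device pitch as dpitch and 400 as the height, so it would reach bytesTouched(1024, 720, 400) = 409296 bytes, far past the end of the host array, which is consistent with the segmentation fault.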

Edit: after ArchaeaSoftware's answer I noticed that the height is indeed the number of rows; a height in bytes doesn't make sense. I updated my answer because you still need to change the pitches.
