Problem fetching a CUDA texture

Posted on 2024-11-03 05:36:55

I am having trouble fetching a texture of floats. The texture is defined as follows:

texture<float, 2, cudaReadModeElementType> cornerTexture;

The binding and parameter settings are:

cornerTexture.addressMode[0]    = cudaAddressModeClamp;
cornerTexture.addressMode[1]    = cudaAddressModeClamp;
cornerTexture.filterMode        = cudaFilterModePoint;
cornerTexture.normalized        = false;
cudaChannelFormatDesc cornerDescription = cudaCreateChannelDesc<float>();


cudaBindTexture2D(0, &cornerTexture, cornerImage->imageData_device, &cornerDescription, cornerImage->width, cornerImage->height, cornerImage->widthStep);

height and width are the sizes of the two dimensions in terms of numbers of elements. widthStep is in terms of number of bytes. In-kernel access occurs as follows:

thisValue = tex2D(cornerTexture, thisPixel.x, thisPixel.y);
printf("thisPixel.x: %i thisPixel.y: %i thisValue: %f\n", thisPixel.x, thisPixel.y, thisValue);

thisValue should always be a non-negative float. printf() is giving me strange, useless values that are different from what the linear memory actually stores. I have tried offsetting the access with a 0.5f on both coordinates, but it gives me the same wrong results.
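As far as I understand the point-sampling rule for unnormalized coordinates, the texel index along each axis is the floor of the coordinate, clamped to the valid range, so x and x + 0.5f should select the same texel. That would explain why the offset changed nothing. Below is a minimal host-side sketch of that rule, not the hardware's actual implementation; pointSampleIndex is a made-up name for illustration, not a CUDA function.

```cpp
// Hypothetical helper (not part of CUDA): texel index picked along one axis
// under point filtering with clamp addressing and unnormalized coordinates.
// Assumes the rule index = floor(x), clamped to [0, size - 1].
constexpr int pointSampleIndex(float x, int size) {
    return x < 0.0f
               ? 0                                // clamp below
               : (x >= static_cast<float>(size)
                      ? size - 1                  // clamp above
                      : static_cast<int>(x));     // truncation equals floor for x >= 0
}
```

Under this rule, pointSampleIndex(3.0f, 10) and pointSampleIndex(3.5f, 10) both pick texel 3, so the half-texel offset only matters once linear filtering is enabled.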

Any ideas?

Update: There seems to be a hidden alignment requirement. From what I can deduce, the pitch passed to cudaBindTexture2D needs to be a multiple of 32 bytes. For example, the following gives incorrect results

cudaBindTexture2D(0, &debugTexture, deviceFloats, &debugDescription, 10, 32, 40)

when fetching the texture, but the following (the same array with its width and height switched) works well:

cudaBindTexture2D(0, &debugTexture, deviceFloats, &debugDescription, 32, 10, 128)

I'm not sure whether I'm missing something or there really is a constraint on the pitch.
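To make the two cases concrete, here is a minimal host-side sketch of the arithmetic. It assumes the 32-byte figure deduced above is the right alignment; on a real device the value to compare against would presumably be the texturePitchAlignment field of cudaDeviceProp, which I have not queried here. The names kAssumedPitchAlignment, isPitchAligned and alignedPitch are made up for illustration and are not part of the CUDA runtime.

```cpp
#include <cstddef>

// Assumed alignment, taken from the observation above; not queried from the device.
constexpr std::size_t kAssumedPitchAlignment = 32;

// Hypothetical helper: true if a row pitch in bytes is a multiple of the alignment.
constexpr bool isPitchAligned(std::size_t pitchBytes,
                              std::size_t alignment = kAssumedPitchAlignment) {
    return pitchBytes % alignment == 0;
}

// Hypothetical helper: smallest multiple of the alignment that can hold
// one row of widthElems elements of elemSize bytes each.
constexpr std::size_t alignedPitch(std::size_t widthElems, std::size_t elemSize,
                                   std::size_t alignment = kAssumedPitchAlignment) {
    return (widthElems * elemSize + alignment - 1) / alignment * alignment;
}
```

With these assumptions, the 10-float rows (pitch 40) fail the check and would need to be padded to 64 bytes per row, while the 32-float rows (pitch 128) already pass, which matches what I observe. Allocating with cudaMallocPitch and binding with the pitch it returns might sidestep the issue, but I have not verified that.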

Update 2: I have filed a bug report with Nvidia. Those who are interested can view it in their developer zone, but I will post the reply back here.
