将纹理内存绑定到 GPU 分配的矩阵

发布于 2024-11-27 01:38:20 字数 1071 浏览 2 评论 0原文

我在 GPU 上创建了一个大小为 (p7P_NXSTATES)x(p7P_NXTRANS) 的浮点矩阵，如下所示：

// Special Transitions
// Host pointer to array of device pointers
float **tmp_xsc = (float**)(malloc(p7P_NXSTATES * sizeof(float*)));
// For every alphabet in scoring profile...
for(i = 0; i < p7P_NXSTATES; i++)
{
    // Allocate memory for device for every alphabet letter in protein sequence
    cudaMalloc((void**)&(tmp_xsc[i]), p7P_NXTRANS * sizeof(float));
    // Copy over arrays
    cudaMemcpy(tmp_xsc[i], gm.xsc[i], p7P_NXTRANS * sizeof(float), cudaMemcpyHostToDevice);
}
// Copy device pointers to array of device pointers on GPU (matrix)
float **dev_xsc;
cudaMalloc((void***)&dev_xsc, p7P_NXSTATES * sizeof(float*));
cudaMemcpy(dev_xsc, tmp_xsc, p7P_NXSTATES * sizeof(float*), cudaMemcpyHostToDevice);

该内存一旦复制到 GPU，就永远不会更改，只能读取。因此，我决定将其绑定到纹理内存。问题是，当使用 2D 纹理内存时，绑定到它的内存实际上只是一个使用偏移量充当矩阵的数组。

我知道我需要使用 cudaBindTexture2D() 和 cudaCreateChannelDesc() 绑定此 2D 内存才能访问它

tex2D(texXSC,x,y)

- 但我只是不知道如何。有什么想法吗？

原文

I created a float point matrix on the GPU of size (p7P_NXSTATES)x(p7P_NXTRANS) like so:

// Special Transitions
// Host pointer to array of device pointers
float **tmp_xsc = (float**)(malloc(p7P_NXSTATES * sizeof(float*)));
// For every alphabet in scoring profile...
for(i = 0; i < p7P_NXSTATES; i++)
{
    // Allocate memory for device for every alphabet letter in protein sequence
    cudaMalloc((void**)&(tmp_xsc[i]), p7P_NXTRANS * sizeof(float));
    // Copy over arrays
    cudaMemcpy(tmp_xsc[i], gm.xsc[i], p7P_NXTRANS * sizeof(float), cudaMemcpyHostToDevice);
}
// Copy device pointers to array of device pointers on GPU (matrix)
float **dev_xsc;
cudaMalloc((void***)&dev_xsc, p7P_NXSTATES * sizeof(float*));
cudaMemcpy(dev_xsc, tmp_xsc, p7P_NXSTATES * sizeof(float*), cudaMemcpyHostToDevice);

This memory, once copied over to the GPU, is never changed and is only read from. Thus, I've decided to bind this to texture memory. Problem is that when working with 2D texture memory, the memory being bound to it is really just an array that uses offsets to function as a matrix.

I'm aware I need to use cudaBindTexture2D() and cudaCreateChannelDesc() to bind this 2D memory in order to access it as such