Allocating more memory for an existing global memory array

Posted on 2024-10-31 17:38:32

Is it possible to add memory to a previously allocated array in global memory?

What I need to do is this:

// cudaMalloc memory for d_A
int n = 0; int N = 100;
do
{
    Kernel<<< , >>>(d_A, n++);
    // add N memory to d_A
} while (n != 5);

Does doing another cudaMalloc remove the values of the previously allocated array? In my case the values of the previously allocated array should be kept...

青朷 2024-11-07 17:38:32

First, cudaMalloc behaves like malloc, not realloc. This means that cudaMalloc allocates completely new device memory at a new location. There is no realloc function in the CUDA API.

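Secondly, as a workaround, you can call cudaMalloc again to allocate more memory. Remember to free the old device pointer with cudaFree before you assign a new address to d_A. The following code shows the allocate/free pattern. Note that freeing d_A discards its contents, so if the old values must be kept you have to copy them into the new allocation (for example with cudaMemcpy and cudaMemcpyDeviceToDevice) before freeing the old one.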

int n = 0; int N = 100;

// set the initial memory size (in bytes)
size_t size = <something>;

do
{
    // allocate just enough memory
    cudaMalloc((void**) &d_A, size);

    Kernel<<< ... >>>(d_A, n++);

    // free memory allocated for d_A
    cudaFree(d_A);

    // increase the memory size (cudaMalloc counts bytes, not elements)
    size += N;

} while (n != 5);
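
If the earlier values have to survive the resize, as the question asks, a realloc-style helper could look roughly like the sketch below. It is not part of the CUDA API: the name `growDeviceArray` and its parameters are made up for illustration, and only `cudaMalloc`, `cudaMemcpy`, and `cudaFree` are real runtime calls. The idea is to allocate the larger buffer first, copy the old contents device-to-device, and only then free the old buffer.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper (not a CUDA API function): grows a device buffer from
// oldBytes to newBytes while keeping the first oldBytes of its contents.
// On success *d_ptr points at the new buffer and the old one is freed.
// On failure *d_ptr is left untouched and still valid.
cudaError_t growDeviceArray(void** d_ptr, size_t oldBytes, size_t newBytes)
{
    void* d_new = nullptr;

    // 1. Allocate the larger buffer first; the old buffer stays alive.
    cudaError_t err = cudaMalloc(&d_new, newBytes);
    if (err != cudaSuccess) return err;

    // 2. Copy the old contents device-to-device (no round trip via the host).
    if (*d_ptr != nullptr && oldBytes > 0) {
        size_t copyBytes = oldBytes < newBytes ? oldBytes : newBytes;
        err = cudaMemcpy(d_new, *d_ptr, copyBytes, cudaMemcpyDeviceToDevice);
        if (err != cudaSuccess) {
            cudaFree(d_new);
            return err;
        }
    }

    // 3. Only now release the old buffer and publish the new pointer.
    cudaFree(*d_ptr);   // cudaFree on a null pointer performs no operation
    *d_ptr = d_new;
    return cudaSuccess;
}
```

In the loop above this would replace the cudaFree/cudaMalloc pair, for example `growDeviceArray((void**)&d_A, size, size + N)`. Both buffers exist at the same time during the copy, so peak memory use is briefly the sum of the old and new sizes.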

Thirdly, cudaMalloc can be an expensive operation, and I expect the above code will be rather slow. I think you should consider why you want to grow the array. Can you allocate memory for d_A once, with enough memory for the largest use case? There is likely no reason to allocate only 100 bytes if you know you will need 1,000 bytes later on!

// calculate the max memory requirement (in bytes)
size_t MAX_SIZE = <something>;

// allocate only once
cudaMalloc((void**) &d_A, MAX_SIZE);

// use for loops when they are appropriate
for (n = 0; n < 5; n++)
{
    Kernel<<< ... >>>(d_A, n);
}
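
One possible reading of "allocate only once" for the loop in the question is to size the buffer for the last iteration and let each step use a larger prefix of it. The sketch below assumes `float` elements, N new elements per step, and N greater than zero; the kernel `fillNew` and the function `runGrowingSteps` are hypothetical names, not anything from the original post.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical kernel: on each step, initialise only the elements that became
// "active" in that step, leaving earlier elements as they were.
__global__ void fillNew(float* d_A, int first, int count, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) d_A[first + i] = value;
}

// Allocates the worst-case size once, then grows only the logical length.
// Returns the device pointer (caller frees it with cudaFree), or nullptr.
float* runGrowingSteps(int N, int steps)
{
    float* d_A = nullptr;
    size_t maxBytes = static_cast<size_t>(N) * steps * sizeof(float);
    if (cudaMalloc((void**)&d_A, maxBytes) != cudaSuccess) return nullptr;

    const int threads = 128;
    const int blocks = (N + threads - 1) / threads;
    for (int n = 0; n < steps; n++) {
        // Step n uses elements [n*N, (n+1)*N); earlier results stay in place.
        fillNew<<<blocks, threads>>>(d_A, n * N, N, static_cast<float>(n));
    }
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(d_A);
        return nullptr;
    }
    return d_A;
}
```

Because nothing is freed or reallocated inside the loop, the values written in earlier steps are kept automatically, and the cost of cudaMalloc is paid only once.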

云柯 2024-11-07 17:38:32

Your pseudocode does not "add memory to a previously allocated array" at all. The standard C way of increasing the size of an existing allocation is via the realloc() function, and there is no CUDA equivalent of realloc() at the time of writing.

When you do

cudaMalloc(d_A....)

// something

cudaMalloc(d_A....)

all you are doing is creating a new memory allocation and assigning it to d_A. The previous memory allocation still exists, and its contents are untouched, but you have now lost the pointer to it and have no way of accessing or freeing it. Based on this and your previous question on almost the same subject, might I suggest you spend a bit of time revising memory and pointer concepts in C before you try CUDA? Unless you have a very clear understanding of these fundamentals, you will find the distributed memory nature of CUDA very confusing.
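
To make the point concrete, here is a small sketch that checks whether data in a first allocation is still readable after a second cudaMalloc. The function name `firstAllocationSurvives` and the variables `d_first` and `d_second` are hypothetical, chosen only for this illustration; the second allocation is stored in a separate pointer so the first address is not lost.

```cuda
#include <cuda_runtime.h>

// Illustrative check (hypothetical function name): returns true if data
// written to a first allocation is still readable after a second cudaMalloc.
bool firstAllocationSurvives()
{
    int h_in = 42, h_out = 0;
    int* d_first = nullptr;
    int* d_second = nullptr;

    if (cudaMalloc((void**)&d_first, sizeof(int)) != cudaSuccess) return false;
    cudaMemcpy(d_first, &h_in, sizeof(int), cudaMemcpyHostToDevice);

    // A second, independent allocation. Because it is stored in a different
    // pointer variable, the address of the first block is not lost.
    if (cudaMalloc((void**)&d_second, sizeof(int)) != cudaSuccess) {
        cudaFree(d_first);
        return false;
    }

    // Read the first block back; the second cudaMalloc did not modify it.
    cudaMemcpy(&h_out, d_first, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_first);
    cudaFree(d_second);
    return h_out == h_in;
}
```

Had both calls written into the same variable, the first address would simply have been overwritten: the block would still be allocated on the device, but the program could no longer read or free it.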
