Why is the heap so large after array allocation

Published 2024-11-25 16:09:46

I've got a very basic application that boils down to the following code:

char* gBigArray[200][200][200];
unsigned int Initialise(){  
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=new char;
    return sizeof(gBigArray);
}

The function returns the expected value of 32000000 bytes, which is approximately 30 MB, yet the Windows Task Manager (granted, it's not 100% accurate) gives a Memory (Private Working Set) value of around 157 MB. I've loaded the application into VMMap by SysInternals and have the following values:

I'm unsure what Image means (listed under Type), although regardless of that, its value is around what I'm expecting. What is really throwing things out for me is the Heap value, which is where the apparent enormous size is coming from.

What I don't understand is why this is? According to this answer if I've understood it correctly, gBigArray would be placed in the data or bss segment - however I'm guessing as each element is an uninitialised pointer it would be placed in the bss segment. Why then would the heap value be larger by a silly amount than what is required?


Comments (8)

无畏 2024-12-02 16:09:46

It doesn't sound silly if you know how memory allocators work. They keep track of the allocated blocks so there's a field storing the size and also a pointer to the next block, perhaps even some padding. Some compilers place guarding space around the allocated area in debug builds so if you write beyond or before the allocated area the program can detect it at runtime when you try to free the allocated space.

牵你的手,一向走下去 2024-12-02 16:09:46

You are allocating one char at a time. There is typically a per-allocation space overhead.

Allocate the memory in one big chunk (or at least in a few chunks).

爱你是孤单的心事 2024-12-02 16:09:46

Do not forget that char* gBigArray[200][200][200]; allocates space for 200*200*200 = 8000000 pointers, each one word in size. That is 32 MB on a 32-bit system.

Add another 8000000 chars to that for another 8 MB. Since you are allocating them one by one, the allocator probably can't hand out exactly one byte per item, so each will likely take at least a word, resulting in another 32 MB (32-bit system).

The rest is probably overhead, which is also significant because the C++ runtime must remember how many elements an array allocated with new contains, for delete [].

九八野马 2024-12-02 16:09:46

Owww! My embedded systems stuff would roll over and die if faced with that code. Each allocation has quite a bit of extra info associated with it, and either is padded to a fixed size or is managed via a linked-list-type object. On my system, that 1-char new would become a 64-byte allocation out of a small-object allocator, so that management happens in O(1) time. But on other systems this could easily fragment your memory horribly, make subsequent news and deletes run extremely slowly, O(n) where n is the number of things the allocator tracks, and in general bring doom upon an app over time, as each char becomes at least a 32-byte allocation placed in all sorts of cubby holes in memory, thus pushing your allocation heap out much further than you might expect.

Do a single large allocation and map your 3D array over it, using placement new or other pointer trickery if you need to.

雅心素梦 2024-12-02 16:09:46

Allocating 1 char at a time is probably more expensive. Each allocation carries a metadata header, and the single byte being allocated is smaller than that header, so you might actually save space by doing one large allocation (if possible); that way you amortise the overhead of each individual allocation carrying its own metadata.

猥︴琐丶欲为 2024-12-02 16:09:46

Perhaps this is an issue of memory stride? What size of gaps are between values?

来日方长 2024-12-02 16:09:46

30 MB is for the pointers. The rest is for the storage you allocated with the new call that the pointers are pointing to. Compilers are allowed to allocate more than one byte for various reasons, like to align on word boundaries, or give some growing room in case you want it later. If you want 8 MB worth of characters, leave the * off your declaration for gBigArray.

淑女气质 2024-12-02 16:09:46

Edited out of the above post into a community wiki post:

As the answers below say, the issue here is I am creating a new char 200^3 times, and although each char is only 1 byte, there is overhead for every object on the heap. It seems creating a char array for all chars knocks the memory down to a more believable level:

char* gBigArray[200][200][200];
char* gCharBlock=new char[200*200*200];
unsigned int Initialise(){  
    unsigned int mIndex=0;
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=&gCharBlock[mIndex++];
    return sizeof(gBigArray);
}