Memory usage in a class - converting double to float did not reduce memory usage as expected

Posted on 2024-11-27 08:54:03

I am creating millions of objects of the following type:

template<class T>
struct node
{
  //some functions
private:
  T m_data_1;
  T m_data_2;
  T m_data_3;

  node* m_parent_1;
  node* m_parent_2;
  node* m_child;
};

The purpose of the template is to let the user choose float or double precision, the idea being that node<float> will occupy less memory (RAM).

However, when I switch from double to float, the memory footprint of my program does not decrease as I expect it to. I have two questions:

  1. Is it possible that the compiler/operating system is reserving more space than my floats require (or even storing them as doubles)? If so, how do I stop this from happening? I'm using Linux on a 64-bit machine with g++.

  2. Is there a tool that lets me determine the amount of memory used by each of the different classes (i.e. some sort of memory profiling), to make sure that the memory isn't being gobbled up somewhere else that I haven't thought of?

Comments (3)

朮生 2024-12-04 08:54:03

If you are compiling for 64-bit, then each pointer is 64 bits (8 bytes) in size, and the compiler will normally align pointers to 8 bytes as well. Three floats take 12 bytes, so the compiler has to insert 4 bytes of padding before the first pointer. Instead of saving 12 bytes, you only save 8. The padding will still be there whether the pointers are at the beginning of the struct or the end, because the size of the struct is rounded up to a multiple of its alignment so that consecutive structs in an array stay aligned.

Also, your structure is primarily composed of 3 pointers. The 8 bytes you save take you from a 48-byte object to a 40-byte object, roughly a 17% reduction. That's not exactly a massive decrease. Again, this assumes you're compiling for 64-bit.

If you're compiling for 32-bit, then you're saving 12 bytes from a 36-byte structure, which is better percentage-wise (about 33%). Potentially more if doubles have to be aligned to 8 bytes, since the double version is then padded to 40 bytes.
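
The padding arithmetic above can be checked on your own compiler. This is only a minimal sketch and not part of the original answer: `node_layout` is a hypothetical copy of the question's struct with public members (so that `offsetof` can be used), and `padding_bytes` and `print_layout` are helper names invented for the illustration.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the question's node: same members, same order,
// but public so that offsetof can inspect the layout.
template <class T>
struct node_layout {
    T m_data_1;
    T m_data_2;
    T m_data_3;

    node_layout* m_parent_1;
    node_layout* m_parent_2;
    node_layout* m_child;
};

// Bytes of padding: total size minus the sum of the member sizes.
template <class T>
constexpr std::size_t padding_bytes() {
    return sizeof(node_layout<T>)
         - (3 * sizeof(T) + 3 * sizeof(node_layout<T>*));
}

// Prints size, alignment, offset of the first pointer and padding
// for one instantiation of the struct.
template <class T>
void print_layout(const char* name) {
    std::printf("%s: size=%zu align=%zu first pointer at offset %zu padding=%zu\n",
                name,
                sizeof(node_layout<T>),
                alignof(node_layout<T>),
                offsetof(node_layout<T>, m_parent_1),
                padding_bytes<T>());
}
```

Calling `print_layout<float>("float")` and `print_layout<double>("double")` from `main` should, on a typical x86-64 Linux build, show sizes of 40 and 48 bytes, with the float version carrying 4 bytes of padding. Other ABIs may give different numbers.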

栩栩如生 2024-12-04 08:54:03

The other answers are correct about the source of the discrepancy. However, the x86/x86-64 hardware does not require pointers (and most other types) to be aligned. Performance is simply better when they are, which is why GCC keeps them aligned by default.

But GCC provides a "packed" attribute to let you exert control over this:

#include <iostream>

template<class T>
struct node
{
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;

    node* m_parent_1;
    node* m_parent_2;
    node* m_child;
};

template<class T>
struct node2
{
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;

    node2* m_parent_1;
    node2* m_parent_2;
    node2* m_child;
} __attribute__((packed));

int
main(int argc, char *argv[])
{
    std::cout << "sizeof(node<double>) == " << sizeof(node<double>) << std::endl;
    std::cout << "sizeof(node<float>) == " << sizeof(node<float>) << std::endl;
    std::cout << "sizeof(node2<float>) == " << sizeof(node2<float>) << std::endl;
    return 0;
}

On my system (x86-64, g++ 4.5.2), this program outputs:

sizeof(node<double>) == 48
sizeof(node<float>) == 40
sizeof(node2<float>) == 36

Of course, the "attribute" mechanism and the "packed" attribute itself are GCC-specific.
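
If the GCC-specific attribute is a concern, `#pragma pack` is a commonly used alternative that GCC, Clang and MSVC all accept. The following is a sketch under that assumption rather than something from the original answer; `packed_node` and `is_tightly_packed` are hypothetical names, and the members are left public for brevity.

```cpp
#pragma pack(push, 1)  // request 1-byte member alignment from here on
template <class T>
struct packed_node {
    T m_data_1;
    T m_data_2;
    T m_data_3;

    packed_node* m_parent_1;
    packed_node* m_parent_2;
    packed_node* m_child;
};
#pragma pack(pop)      // restore the previous packing for later declarations

// True when the struct contains no padding at all, i.e. its size is exactly
// the sum of its member sizes.
template <class T>
constexpr bool is_tightly_packed() {
    return sizeof(packed_node<T>)
        == 3 * sizeof(T) + 3 * sizeof(packed_node<T>*);
}

// Fail the build loudly if the compiler ignored the pragma.
static_assert(is_tightly_packed<float>(),
              "compiler did not honour #pragma pack");
```

Either way, the pointers inside a packed struct are no longer naturally aligned, so accesses may be slower, and binding references or pointers to packed members is best avoided.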

想你的星星会说话 2024-12-04 08:54:03

In addition to the valid points that Nicol makes:

When you call new/malloc, it doesn't necessarily correspond one-to-one with a call to the OS to allocate memory. To reduce the number of expensive system calls, the heap manager may request more memory from the OS than was asked for, and then "suballocate" chunks of it when you call new/malloc. Also, the OS hands out memory in whole pages (typically 4 KB, the minimum page size). Essentially, there may be chunks of allocated memory that are not currently in active use, kept around to speed up future allocations.
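
To see how much the heap manager really sets aside per request, you can ask it. This is a minimal sketch that assumes glibc, as on the asker's Linux/g++ setup: `malloc_usable_size` is a glibc extension and not standard C++, and `usable_bytes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <malloc.h>  // malloc_usable_size: glibc extension, not standard C++

// Returns how many bytes the allocator actually made usable for a request
// of `requested` bytes, or 0 if the allocation failed.
std::size_t usable_bytes(std::size_t requested) {
    void* p = std::malloc(requested);
    if (p == nullptr) {
        return 0;
    }
    const std::size_t usable = malloc_usable_size(p);
    assert(usable >= requested);  // never less than what was asked for
    std::free(p);
    return usable;
}
```

Comparing `usable_bytes(36)`, `usable_bytes(40)` and `usable_bytes(48)` shows which node sizes fall into the same allocator size class. On a 64-bit glibc system, 36 and 40 typically do, so trimming a few bytes from a node that is allocated individually may not change the footprint at all.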

To answer your questions directly:

1) Yes, the runtime will very likely allocate more memory than you asked for. This memory is not wasted: it will be used for future new/malloc calls, but it will still show up in "task manager" or whatever tool you use. No, it will not promote floats to doubles. The more allocations you make, the less likely this edge condition is to be the cause of the size difference, and the items in Nicol's answer will dominate. For a smaller number of allocations, this item is likely to dominate (where "large" and "small" depend entirely on your OS and kernel).

2) The Windows Task Manager will give you the total memory allocated. Something like WinDbg will actually give you the virtual memory range chunks (usually allocated in a tree) that were allocated by the runtime. For Linux, I expect this data will be available in one of the files in the /proc directory associated with your process.
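
For the Linux case in 2), one file that carries this kind of data is /proc/self/status, which contains lines such as `VmRSS:` (resident set size) with values in kB. The sketch below reads one such field from inside the running program. `proc_status_kb` is a hypothetical helper name, and the function simply returns -1 where the file or field does not exist.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Returns the numeric value (in kB) of a field such as "VmRSS" from
// /proc/self/status, or -1 if the file or the field is unavailable.
long proc_status_kb(const std::string& field) {
    assert(!field.empty());  // precondition: a field name must be given
    const std::string prefix = field + ":";

    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        // Match lines that start with "<field>:".
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            if (rest >> kb) {
                return kb;
            }
            return -1;  // field present but not numeric
        }
    }
    return -1;  // file missing (non-Linux) or field not found
}
```

Calling `proc_status_kb("VmRSS")` before and after building the node structure, once with float and once with double, gives a rough per-run comparison. It measures the whole process, so it includes the allocator overhead described above.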
