Memory usage in a class - converting double to float didn't reduce memory usage as expected

Posted on 2024-11-27 08:54:03

I am creating millions of objects of the following class type:

template<class T>
struct node
{
  //some functions
private:
  T m_data_1;
  T m_data_2;
  T m_data_3;

  node* m_parent_1;
  node* m_parent_2;
  node* m_child;
};

The purpose of the template is to let the user choose float or double precision, the idea being that node<float> will occupy less memory (RAM).

However, when I switch from double to float, the memory footprint of my program does not decrease as much as I expect. I have two questions:

1) Is it possible that the compiler or operating system is reserving more space than my floats require (or even storing them as doubles)? If so, how do I stop this from happening? I'm using Linux on a 64-bit machine with g++.
2) Is there a tool that lets me determine the amount of memory used by all the different classes (i.e. some sort of memory profiling), to make sure the memory isn't being gobbled up somewhere else that I haven't thought of?

朮生 2024-12-04 08:54:03

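If you are compiling for 64-bit, each pointer is 64 bits in size. That also means pointers may need to be aligned to 64 bits. So if you store 3 floats (12 bytes), the compiler may have to insert 4 bytes of padding before the pointers, and you save only 8 bytes instead of 12. The padding is there whether the pointers sit at the beginning of the struct or at the end: it is needed so that consecutive structs placed in an array all stay aligned.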

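Also, your struct consists mostly of its 3 pointers. The 8 bytes you save take you from a 48-byte object to a 40-byte object, which is not a massive decrease. Again, that is if you are compiling for 64-bit.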

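If you are compiling for 32-bit, you save 12 bytes out of a 36-byte struct, which is better percentage-wise. The saving may be larger still if doubles have to be aligned to 8 bytes.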

栩栩如生 2024-12-04 08:54:03

The other answers are correct about the source of the discrepancy. However, x86/x86-64 hardware does not require pointers (or other types) to be aligned. Performance is simply better when they are, which is why GCC keeps them aligned by default.

But GCC provides a "packed" attribute to let you exert control over this:

#include <iostream>

// Default layout: the compiler pads members to their natural alignment.
template<class T>
struct node
{
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;

    node* m_parent_1;
    node* m_parent_2;
    node* m_child;
};

// Same members, but packed: no padding is inserted between or after them.
template<class T>
struct node2
{
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;

    node2* m_parent_1;
    node2* m_parent_2;
    node2* m_child;
} __attribute__((packed));

int main()
{
    std::cout << "sizeof(node<double>) == " << sizeof(node<double>) << std::endl;
    std::cout << "sizeof(node<float>) == " << sizeof(node<float>) << std::endl;
    std::cout << "sizeof(node2<float>) == " << sizeof(node2<float>) << std::endl;
    return 0;
}

On my system (x86-64, g++ 4.5.2), this program outputs:

sizeof(node<double>) == 48
sizeof(node<float>) == 40
sizeof(node2<float>) == 36

Of course, the "attribute" mechanism and the "packed" attribute itself are GCC extensions (Clang accepts them as well), not standard C++.

想你的星星会说话 2024-12-04 08:54:03

In addition to the valid points Nicol makes:

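A call to new/malloc does not necessarily correspond one-to-one with a request to the operating system for memory. To reduce the number of expensive system calls, the heap manager may obtain more memory than was requested and then sub-allocate chunks of it as new/malloc is called. In addition, the OS hands out memory only in whole pages, typically 4 KB at a time (the minimum page size). In short, the process may hold allocated memory that is not in active use at the moment and is kept to speed up future allocations.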

To answer your questions directly:

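1) Yes, the runtime will very likely allocate more memory than you asked for. That memory is not wasted, since it will serve future new/malloc calls, but it still shows up in Task Manager or whatever tool you use. No, it will not promote floats to doubles. The more allocations you make, the less likely this edge effect is to explain the size difference, and the points in Nicol's answer will dominate. For a small number of allocations, this effect is likely to dominate instead (where "large" and "small" depend entirely on your OS and kernel).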

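2) The Windows Task Manager gives you the total memory allocated. A tool such as WinDbg will actually show you the virtual memory range chunks (usually allocated in a tree) that the runtime has allocated. On Linux, I would expect this data to be available in one of the files under the /proc directory associated with your process.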
