Memory usage of STL data structures, Windows vs. Linux
I have a program that makes heavy use of std::map. Under Windows it uses much more memory than under Linux. Does anyone have an idea why this happens?
Linux: Last process took 42.31 s and used not more than 909 MB (RSS 900 MB) of memory
Windows: Last process took 75.373 s and used not more than 1394 MB (RSS 1395 MB) of memory
I use gcc 4.4.3 and the VS 2010 C++ compiler on the command line, with release settings.
EDIT:
Sorry for answering the questions so late...
The code looks like this:
    enum Symbol {
        ...
    };

    class GraphEntry {
    public:
        ...
        virtual void setAttribute (Symbol name, Value * value) = 0;
        const Value * attribute (Symbol name) const;
    private:
        std::map<Symbol, Attribute> m_attributes;
    };

    class Attribute {
    public:
        Attribute (Symbol name, Value * val);
        ...
        Symbol name () const;
        Value * valuePointer () const;
        void setValuePointer (Value * p);
    private:
        Symbol m_name;
        Value * m_value;
    };

    class Graph : public GraphEntry {
        ...
    public:
        Node * newNode (...);
        Graph * newSubGraph (...);
        Edge * newEdge (...);
        ...
        void setSomeAttribute (int x);
        void setSomeOtherAttribute (float f);
        ...
    private:
        std::vector<GraphEntry *> m_entries;
    };
The whole thing describes a graph structure that can hold attributes on its nodes and edges. Value is just a base class, and the derived classes can hold values of arbitrary types, such as int or std::string.
EDIT 2:
Under Windows, I use the following flags: -DRELEASE -DNDEBUG -DQT_NO_DEBUG -DQT_NO_DEBUG_OUTPUT -D_CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -DNOMINMAX /O2 /MD /Gy /EHsc
EDIT 3:
The memory usage is read from a /proc file under Linux (like memuse).
Under Windows, some WinAPI functions are called, but I am no expert on this, so that is all I can say about it.
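For readers who want to reproduce the Linux side of the measurement, here is a minimal sketch of reading a memory figure from /proc. It assumes the figures come from /proc/self/status, where VmRSS is the current resident set size and VmHWM the peak; the question does not say which file or field was actually used, and the function names parseStatusKb and currentStatusKb are made up for this illustration.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parse a "Key:   12345 kB" line from a /proc/<pid>/status-style stream.
// Returns the value in kB, or -1 if the key is missing or malformed.
// (Hypothetical helper, not taken from the question.)
long parseStatusKb(std::istream& in, const std::string& key) {
    assert(!key.empty());
    std::string line;
    while (std::getline(in, line)) {
        const bool keyMatches = line.compare(0, key.size(), key) == 0 &&
                                line.size() > key.size() &&
                                line[key.size()] == ':';
        if (keyMatches) {
            std::istringstream fields(line.substr(key.size() + 1));
            long kb = -1;
            if (fields >> kb) {
                return kb;
            }
            return -1;
        }
    }
    return -1;
}

// Convenience wrapper for the current process (Linux only).
long currentStatusKb(const std::string& key) {
    std::ifstream status("/proc/self/status");
    return parseStatusKb(status, key);
}
```

Calling currentStatusKb("VmHWM") near the end of the run would give a peak figure comparable to the "used not more than" number, while currentStatusKb("VmRSS") gives the value at that moment.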
EDIT 4:
Using /GS- and -D_SECURE_SCL results in: Last process took 170.281 s and used not more than 1391 MB (RSS 1393 MB) of memory
Comments (6)
For VC++, try the /GS- command-line switch.
When you say the memory used is "not more than", are you referring to the peak memory usage or the mean memory usage over the lifetime of the application?
Make sure that all memory your application allocates with 'new', 'malloc', or any other allocation call is released with 'delete', 'free', or the equivalent library call.
On Linux, you can use valgrind to check for memory leaks.
You'll observe that memory usage on Windows is somewhere between 1 and 2 times greater. Heap algorithms aside, Windows malloc(), and consequently any data structure allocated on the heap via new (such as std::map's nodes with the default allocator type), is aligned to 16 bytes. On Linux, glibc defaults to 8-byte alignment. Assuming some smoothing of the differences due to fragmentation, reaping of unused pages and so on, you can expect the differences to become less apparent.
A quick check of your code indicates that the map key and value types should be 4 and 8 bytes respectively (Symbol and Attribute). These will round up to 8 bytes on Linux and 16 bytes on Windows. You should have an equal number of map nodes on both platforms. At least in the MSVC implementation, the nodes look to consume a minimum of 22 bytes, which MSVC will expand to 32 due to its member alignment rules; 16 bytes is also its malloc granularity. GCC will expand its nodes to 24, giving an approximate total of 48 bytes per node in MSVC against 32 in GCC/Linux. That is roughly 50% more memory usage on Windows.
Here's the node structure used in MSVC; I can look up the GCC equivalent if you are interested:
I'll add, for those who are unfamiliar with how memory usage works, that several factors are at play:
malloc() alignment rules are in use (unless you subvert the usual heap or use some allocator other than the default).
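The rounding arithmetic in this answer can be sketched as follows. This is only an illustration under stated assumptions: SketchNode is a hypothetical stand-in for a red-black tree node (three links, the stored value, two one-byte flags), not the real MSVC or GCC definition, and the estimate ignores any per-block bookkeeping the heap adds on top.

```cpp
#include <cassert>
#include <cstddef>

// Round n up to the next multiple of granularity (granularity must be non-zero).
std::size_t roundUp(std::size_t n, std::size_t granularity) {
    assert(granularity != 0);
    return (n + granularity - 1) / granularity * granularity;
}

// Hypothetical tree node for illustration; real library nodes differ in detail.
template <class Value>
struct SketchNode {
    SketchNode* left;
    SketchNode* parent;
    SketchNode* right;
    Value value;   // the stored key/value pair in a real map
    char color;    // red or black
    char isNil;    // marks the sentinel node
};

// Estimated heap bytes consumed by one node when every allocation is
// rounded up to the given granularity (for example 8 or 16 bytes).
template <class Value>
std::size_t estimatedNodeCost(std::size_t granularity) {
    return roundUp(sizeof(SketchNode<Value>), granularity);
}
```

With the figures quoted above, roundUp(22, 16) gives 32 and roundUp(22, 8) gives 24, which is where the 32-byte and 24-byte node sizes come from.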
Each compiler ships with its own implementation of the STL, so you are comparing GCC's STL together with GCC's allocation routines against Visual C++'s STL together with Visual C++'s allocation routines.
It's quite difficult to draw a meaningful comparison here because you don't know whether the allocation routines or the STL implementation (or possibly both) are actually responsible.
I assume you are not comparing a 32-bit program with a 64-bit program, since that would be even less meaningful.
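One way to tell the two factors apart is to count what the container itself asks for, independent of what the heap then does with the request. The following is a rough sketch, assuming a C++11 or later compiler (newer than the ones in the question); CountingAllocator, the global counters, and requestedBytesPerNode are hypothetical names introduced only for this example, and the int-to-pointer map merely stands in for the question's std::map<Symbol, Attribute>.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical global counters, for illustration only (not thread-safe).
static std::size_t g_allocCalls = 0;
static std::size_t g_allocBytes = 0;

// Minimal allocator that records every request before forwarding it
// to the global operator new.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocCalls;
        g_allocBytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

using CountedMap = std::map<int, void*, std::less<int>,
                            CountingAllocator<std::pair<const int, void*>>>;

// Average number of bytes the map requested per inserted element.
std::size_t requestedBytesPerNode(std::size_t count) {
    assert(count > 0);
    g_allocCalls = 0;
    g_allocBytes = 0;
    CountedMap m;
    for (std::size_t i = 0; i < count; ++i) {
        m[static_cast<int>(i)] = nullptr;
    }
    return g_allocBytes / count;
}
```

If this number is similar on both platforms while the process-level memory figures still differ, the allocation routines are the more likely cause; if the number itself differs, the STL implementation's node layout is contributing.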
Some versions of VC++ use checked iterators (_SECURE_SCL) in release builds, too. VC2005 and VC2008 have them turned on by default. VC2010 disables them by default.
Depending on your compiler, that could be another thing to check (and turn off).
Did you run the test in release or debug mode under Windows? The STL in debug mode does a lot of extra checking; maybe it also uses more memory to be able to perform all those checks.