boost::serialization: high memory consumption during serialization

Posted on 2024-10-09 23:52:46



Just as the title suggests, I've run into an issue with boost::serialization when serializing a huge amount of data to a file. The problem is that the memory footprint of the serialization part of the application is around 3 to 3.5 times the memory of the objects being serialized.
It is important to note that the data structure I have is a three-dimensional vector of base class pointers, held through a pointer to that structure. Like this:

using namespace std;    
vector<vector<vector<MyBase*> > >* data;

This is later serialized with code analogous to this:

ar & BOOST_SERIALIZATION_NVP(data);

boost/serialization/vector.hpp is included.

The classes being serialized all inherit from "MyBase".
Since the start of my project I've used different archives for serialization, from the typical binary, text, and XML archives to, finally, the polymorphic binary/XML/text ones. Every single one of them behaves exactly the same way.

This wouldn't be a problem if I had to serialize small amounts of data, but the number of objects I have is in the millions (ideally around 10 million). As far as I've been able to test it, the memory usage consistently shows that the memory allocated by the boost::serialization part of the code is around 2/3 of the application's whole memory footprint while writing the file.

This amounts to around 13.5 GB of RAM taken for 4 million objects, where the objects themselves take 4.2 GB. This is as far as I've been able to take my code, since I don't have access to a machine with more than 8 GB of physical RAM. I should also note that this is a 64-bit application running on Windows 7 Professional x64, but the situation is similar on an Ubuntu box.
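As a sanity check on these figures, here is a minimal sketch of the arithmetic. The struct and function names are made up for illustration, and it assumes decimal gigabytes (1 GB = 10^9 bytes), which the post does not state.

```cpp
// Hypothetical helper: summarizes the footprint figures quoted in the post.
struct FootprintSummary {
    double ratio;                      // total footprint / object memory
    double overhead_fraction;          // share of the footprint that is not object data
    double overhead_bytes_per_object;  // extra bytes per serialized object
};

FootprintSummary summarize(double total_gb, double objects_gb, double object_count) {
    const double bytes_per_gb = 1e9;  // assumption: decimal GB
    const double overhead_gb = total_gb - objects_gb;
    return {total_gb / objects_gb,
            overhead_gb / total_gb,
            overhead_gb * bytes_per_gb / object_count};
}
```

Calling summarize(13.5, 4.2, 4e6) gives a ratio of roughly 3.2, an overhead share of about 69%, and about 2.3 KB of extra memory per object. The first two agree with the "3 to 3.5 times" and "2/3" figures quoted above.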

Does anyone have any idea how I could go about troubleshooting this? It is unacceptable for me to have such high memory requirements for an application that will not use as much memory while running as it does while serializing.

Deserialization isn't as bad, as it allocates around 1.5 times the needed memory. This is something I could live with.

I tried turning tracking off with boost::archive::archive_flags::no_tracking, but it behaves exactly the same.

Does anyone have any idea what I should do?


Comments (1)

对你的占有欲 2024-10-16 23:52:46


Using valgrind, I found that the main cause of the memory consumption is a map inside the library that tracks pointers. If you are certain that you do not need pointer tracking (that is, you are sure there is no pointer aliasing), disable tracking. You can find the main concepts of disabling tracking here. In short, you must do something like this:

BOOST_CLASS_TRACKING(vector<vector<vector<MyBase*> > >, boost::serialization::track_never)

In my question I wrote a version of this macro that lets you disable tracking for a template class. This should have a significant impact on your memory consumption.
Also note that the containers hold pointers, and if you want tracking never to happen you must disable tracking for those too. Currently I could not find any way to do this properly.
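To illustrate why a pointer-tracking map grows with the number of objects, and what is given up when it is switched off, here is a simplified model. This is not Boost's actual implementation: Item, SketchWriter, and run are hypothetical names, and the "archive" only counts what it would write.

```cpp
#include <cstddef>
#include <map>
#include <vector>

struct Item { int value; };

// Hypothetical stand-in for an output archive; it only counts what it would write.
class SketchWriter {
public:
    explicit SketchWriter(bool track) : track_(track) {}

    void save_pointer(const Item* p) {
        if (track_) {
            // One map entry per distinct address: this is the per-object overhead.
            auto result = seen_.emplace(static_cast<const void*>(p), next_id_);
            if (!result.second) {
                ++back_references_;  // already saved: only a reference would be written
                return;
            }
            ++next_id_;
        }
        ++objects_written_;  // full object data would be written here
    }

    std::size_t tracked_entries() const { return seen_.size(); }
    std::size_t objects_written() const { return objects_written_; }
    std::size_t back_references() const { return back_references_; }

private:
    bool track_;
    std::map<const void*, unsigned> seen_;
    unsigned next_id_ = 0;
    std::size_t objects_written_ = 0;
    std::size_t back_references_ = 0;
};

// Saves n distinct objects, then saves the first one again through an alias.
SketchWriter run(bool track, std::size_t n) {
    std::vector<Item> items(n);
    SketchWriter writer(track);
    for (const Item& item : items) writer.save_pointer(&item);
    if (n > 0) writer.save_pointer(&items[0]);
    return writer;
}
```

With tracking on, run(true, 1000) ends with 1000 map entries and writes each object once, turning the aliased pointer into a back reference. With tracking off, run(false, 1000) keeps no map at all but writes the aliased object twice, which is why disabling tracking is only safe when there is no pointer aliasing.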
