std::类似向量的类经过优化以容纳少量项目
在程序的一个时间关键部分中,有一个类成员如下所示: std::vector m_vLinks; 在分析过程中,我注意到该向量大约 99.98% 的执行仅包含 0 或 1 个项目。 然而,在极少数情况下,它可能会容纳更多。 根据探查器,这个向量绝对是一个瓶颈,所以我正在考虑以下优化:
- 制作一个具有类似向量接口的手工类
- 这个类将保存真实大小、一项和指向向量的可选指针
- 在这种情况下,当向量保存 1 个项目,不会有任何动态内存分配,而且由于删除了一个间接寻址,访问该项目也会(稍微)快一些。
- 当我们需要保存更多数据时,向量是动态分配的
- 当然这个向量不会提供一个内存块来保存所有项目(这里不需要),而且一些操作会更复杂
在开始原型化这个东西之前看看它是否有帮助,不知道有没有人在某些3rd-party库中遇到过具有类似功能的自定义容器?
我已经考虑过 boost::array,但不希望它施加大小限制
In one time-critical part of the program there is a member of the class that looks like that:
std::vector m_vLinks;
During profiling I noticed that about 99.98% of executions this vector holds only 0 or 1 items.
However in very rarely cases it might hold more.
This vector is definitely a bottleneck according to profiler, so I'm thinking about following optimization:
- Craft a hand-made class with vector-like interface
- This class will hold true size, one item and optional pointer to the vector
- In this case when vector holds 1 item there won't be any dynamic memory allocations, and also accessing this item will be (a bit) faster due to removing of one indirection.
- When we need to hold more data vector is dynamically allocated
- Of course this vector won't provide one memory block holding all items (not needed here), and also some operations will be more complex
Before starting to prototype this thing to see if it helps, I wonder if anyone encountered custom containers with similar functionality in some 3rd-party libraries?
I already thought about boost::array, but don't want size limit that it imposes
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
LLVM 有一个名为 SmallVector 的类。
LLVM has a class for that called SmallVector.
在代码的非时间关键部分中,执行:
m_vLinks.reserve(1);
。这样,在时间关键的部分,通常不会有动态分配。In a non-time-critical part of your code, perform:
m_vLinks.reserve(1);
. That way, in the time-critical part, there will typically be no dynamic allocation.我的第一个尝试是优化内存分配器。简单的
malloc
实现效率不太高,您可能需要尝试tcmalloc
或jemalloc
。我的第二次尝试是更改分配器。 Howard Hinnant 演示了如何使用在堆栈上预分配一些内存的有状态分配器。这仅符合 C++11 标准,但可能已受支持。
我的第三次尝试是更改代码并在可能的情况下预先分配内存。您可以保留它,而不是每次重新构建
向量
:它的容量不会减少,因此后续使用不会分配内存。自制程序实现与 std::vector类的速度相匹配的可能性很小,因为它的许多方法都已针对最佳性能进行了调整。
My first attempt would be to optimize the memory allocator. Naive
malloc
implementations are not too efficient, you may want to trytcmalloc
orjemalloc
.My second attempt would be to change the allocator. Howard Hinnant has demonstrated how to use a stateful allocator that has some memory preallocated on the stack. This is only Standard compliant in C++11, but may already be supported.
My third attempt would be to change the code and pre-allocate the memory if possible. Instead of building the
vector
anew each time, you could keep it around: its capacity will not diminish and so subsequent uses won't allocate memory.There are few chances that a homebrew implementation would match the speed of the
std::vector<T>
classes, as many of its methods have been tuned for maximum performance.对于这些情况,我通常使用 std::list 。当 N==1 时,
std::list
中的 O(N) 操作不会对我造成伤害。I generally use
std::list
for these cases. The O(N) operations instd::list
don't hurt me when N==1.