Should I use an insertion sort or build a heap to improve performance?

Posted on 2024-07-27 14:58:45

We have large (100,000+ elements) ordered vectors of structs (operator < overloaded to provide ordering):

std::vector<MyType> vectorMyTypes;
std::sort(vectorMyTypes.begin(), vectorMyTypes.end());

My problem is that we're seeing performance problems when adding new elements to these vectors while preserving sort order. At the moment we're doing something like:

for ( a very large set )
{
    vectorMyTypes.push_back(newType);
    std::sort(vectorMyTypes.begin(), vectorMyTypes.end()); // full re-sort after every insertion

    ...

    ValidateStuff(vectorMyTypes); // this method expects the vector to be ordered
}

This isn't exactly what our code looks like since I know this example could be optimised in different ways, however it gives you an idea of how performance could be a problem because I'm sorting after every push_back.

I think I essentially have two options to improve performance:

  1. Use a (hand crafted?) insertion sort instead of std::sort to improve the sort performance (insertion sorts on a partially sorted vector are blindingly quick)

  2. Create a heap by using std::make_heap and std::push_heap to maintain the sort order

My questions are:

  • Should I implement an insertion sort? Is there something in Boost that could help me here?

  • Should I consider using a heap? How would I do this?


Edit:

Thanks for all your responses. I understand that the example I gave was far from optimal and it doesn't fully represent what I have in my code right now. It was simply there to illustrate the performance bottleneck I was experiencing - perhaps that's why this question isn't seeing many up-votes :)

Many thanks to you, Steve. It's often the simplest answers that are the best, and perhaps my over-analysis of the problem blinded me to the most obvious solution. I do like the neat method you outlined to insert directly into a pre-ordered vector.

As I've commented, I'm constrained to using vectors right now, so std::set, std::map, etc aren't an option.

Comments (10)

浊酒尽余欢 2024-08-03 14:58:45

Ordered insertion doesn't need boost:

vectorMyTypes.insert(
    std::upper_bound(vectorMyTypes.begin(), vectorMyTypes.end(), newType),
    newType);

upper_bound provides a valid insertion point provided that the vector is sorted to start with, so as long as you only ever insert elements in their correct place, you're done. I originally said lower_bound, but if the vector contains multiple equal elements, then upper_bound selects the insertion point which requires less work.
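
A minimal sketch of that ordered insertion wrapped in a helper. `MyType` below is a hypothetical stand-in with a single `key` field, and `insertSorted` is an illustrative name; neither comes from the asker's code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the asker's struct; only operator< is required.
struct MyType {
    int key;
};

bool operator<(const MyType& a, const MyType& b) { return a.key < b.key; }

// Inserts value into an already sorted vector, keeping it sorted.
// The search is O(log n); the insert itself still shifts O(n) elements.
void insertSorted(std::vector<MyType>& v, const MyType& value) {
    v.insert(std::upper_bound(v.begin(), v.end(), value), value);
}
```

Calling `insertSorted(vectorMyTypes, newType)` in place of the `push_back` plus `std::sort` pair keeps the vector ordered for `ValidateStuff` without re-sorting it.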

This does have to copy O(n) elements, but you say insertion sort is "blindingly fast", and this is faster. If it's not fast enough, you have to find a way to add items in batches and validate at the end, or else give up on contiguous storage and switch to a container which maintains order, such as set or multiset.

A heap does not maintain order in the underlying container, but is good for a priority queue or similar, because it makes removal of the maximum element fast. You say you want to maintain the vector in order, but if you never actually iterate over the whole collection in order then you might not need it to be fully ordered, and that's when a heap is useful.
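
To illustrate the heap point, here is a small sketch using `std::push_heap` and `std::pop_heap` on a `std::vector<int>`. The helper names `heapPush` and `heapPopMax` are made up for this example, and `int` stands in for the real element type.

```cpp
#include <algorithm>
#include <vector>

// Adds value to a vector that is already a max-heap (or empty).
void heapPush(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::push_heap(heap.begin(), heap.end());
}

// Removes and returns the largest element; the heap must not be empty.
int heapPopMax(std::vector<int>& heap) {
    std::pop_heap(heap.begin(), heap.end()); // moves the max to the back
    int top = heap.back();
    heap.pop_back();
    return top;
}
```

Only `heap.front()` is guaranteed to be the maximum; the rest of the vector is in heap order, not sorted order, so a function that expects a sorted vector cannot consume it directly.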

睫毛溺水了 2024-08-03 14:58:45

According to Item 23 of Meyers' Effective STL, you should use a sorted vector if your application uses its data structures in three phases. From the book, they are:

  1. Setup. Create a new data structure by inserting lots of elements into it. During this phase, almost all operations are insertions and erasures. Lookups are rare or nonexistent.
  2. Lookup. Consult the data structure to find specific pieces of information. During this phase, almost all operations are lookups. Insertions and erasures are rare or nonexistent. There are so many lookups that the performance of this phase makes the performance of the other phases incidental.
  3. Reorganize. Modify the contents of the data structure, perhaps by erasing all the current data and inserting new data in its place. Behaviorally, this phase is equivalent to phase 1. Once this phase is completed, the application returns to phase 2.

If your use of your data structure resembles this, you should use a sorted vector and then use binary_search as mentioned. If not, a typical associative container should do: a set, multiset, map or multimap, as those structures are ordered by default.
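
A rough sketch of the setup and lookup phases on a sorted vector. The function names `buildSorted` and `contains` are invented for illustration, and `int` stands in for the real element type.

```cpp
#include <algorithm>
#include <vector>

// Setup phase: collect values in any order, then sort once.
std::vector<int> buildSorted(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

// Lookup phase: O(log n) membership test on the sorted vector.
bool contains(const std::vector<int>& sorted, int key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```

The single sort costs O(n log n) once per setup or reorganize phase, which is where this pattern differs from sorting after every insertion.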

零時差 2024-08-03 14:58:45

Why not just use a binary search to find where to insert the new element? Then you will insert exactly into the required position.

数理化全能战士 2024-08-03 14:58:45

If you need to insert a lot of elements into a sorted sequence, use std::merge, potentially sorting the new elements first:

#include <algorithm>
#include <iterator>
#include <vector>

// Merges the sorted newFoos into the sorted oldFoos.
void add( std::vector<Foo> & oldFoos, const std::vector<Foo> & newFoos ) {
    std::vector<Foo> merged;
    // precondition: oldFoos _and newFoos_ are sorted
    merged.reserve( oldFoos.size() + newFoos.size() ); // only for std::vector
    std::merge( oldFoos.begin(), oldFoos.end(),
                newFoos.begin(), newFoos.end(),
                std::back_inserter( merged ) );
    // apply std::unique, if wanted, here (requires operator== on Foo)
    merged.erase( std::unique( merged.begin(), merged.end() ), merged.end() );
    oldFoos.swap( merged ); // commit changes
}
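
A variation on the same batching idea, not the method above: append the new elements to the existing vector and restore order with `std::inplace_merge`, which avoids building a separate `merged` vector by hand. The name `addBatch` and the use of `int` elements are assumptions for the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Appends a batch to a sorted vector and restores order with one merge.
// Precondition: sorted is already sorted; batch may be in any order.
void addBatch(std::vector<int>& sorted, std::vector<int> batch) {
    std::sort(batch.begin(), batch.end());
    const std::ptrdiff_t mid = static_cast<std::ptrdiff_t>(sorted.size());
    sorted.insert(sorted.end(), batch.begin(), batch.end());
    // Merges the two sorted runs [begin, mid) and [mid, end) in place.
    std::inplace_merge(sorted.begin(), sorted.begin() + mid, sorted.end());
}
```

This only helps if validation can wait until a whole batch has been added; unlike the version above, it keeps duplicates.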

千纸鹤 2024-08-03 14:58:45

Using a binary search to find the insertion location isn't going to speed up the algorithm much because it will still be O(N) to do the insertion (consider inserting at the beginning of a vector - you have to move every element down one to create the space).

A tree-based structure such as a heap will be O(log(N)) to insert, which is much better performance.

See http://www.sgi.com/tech/stl/priority_queue.html

Note that a binary search tree will still have worst case O(N) performance for insert unless it is balanced, e.g. an AVL tree. A binary heap stored in an array is always balanced.
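
For reference, a short sketch of what `std::priority_queue` offers. The helper `drainDescending` is a made-up name, and `int` stands in for the real element type.

```cpp
#include <queue>
#include <vector>

// Drains a max-priority queue; elements come out largest first.
// Takes the queue by value so the caller's copy is left untouched.
std::vector<int> drainDescending(std::priority_queue<int> pq) {
    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());
        pq.pop();
    }
    return out;
}
```

Each `push` and `pop` is O(log N), but the queue exposes only `top()`, so the only way to see the elements in order is to pop them all.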

懒的傷心 2024-08-03 14:58:45

Why not use boost::multi_index?

NOTE: boost::multi_index does not provide memory contiguity, a property of std::vectors by which elements are stored adjacent to one another in a single block of memory.

若沐 2024-08-03 14:58:45

There are a few things you need to do.

  1. You may want to consider making use of reserve() to avoid excessive reallocation of the entire vector. If you know the size it will grow to, you may gain some performance by calling reserve() yourself (rather than having the implementation do it automatically using the built-in heuristic).

  2. Do a binary search to find the insertion location. Then resize and shift everything following the insertion point up by one to make room.

  3. Consider: do you really want to use a vector? Perhaps a set or map is better.

Note that lower_bound is itself a binary search, so the search is cheap either way. The theta(n) cost comes from the shift, and you pay much less of it when the insertion point is close to the end of the vector.
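
One way to sketch step 2, assuming `int` elements: append the new value and rotate it back to its sorted position. `pushAndShift` is an illustrative name, and `std::rotate` is swapped in here for the hand-written resize and shift.

```cpp
#include <algorithm>
#include <vector>

// Appends value, then rotates it back into its sorted position.
// Precondition: v is sorted before the call.
void pushAndShift(std::vector<int>& v, int value) {
    v.push_back(value);
    auto last = v.end() - 1; // position of the new element
    auto pos = std::upper_bound(v.begin(), last, value);
    std::rotate(pos, last, v.end()); // shifts [pos, last) up by one
}
```

Calling `v.reserve(n)` once before the loop, with `n` an estimate of the final size, means none of the `push_back` calls up to that size will reallocate.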

挽清梦 2024-08-03 14:58:45
  1. If you want to insert an element into the "right" position, why do you plan on using sort? Find the position using lower_bound and insert using, well, the `insert` method of the vector. That will still be O(N) to insert a new item.

  2. A heap is not going to help you, because a heap is not sorted. It allows you to get at the smallest element quickly, and then quickly remove it and get the next smallest element. However, the data in a heap is not stored in sort order, so if you have algorithms that must iterate over the data in order, it will not help.

I am afraid your description skipped too much detail, but it seems like this kind of list (a vector) is just not the right container for the task. std::deque is much better suited for insertion in the middle, and you might also consider std::set. I suggest you explain why you need to keep the data sorted to get more helpful advice.
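
A small sketch of the two alternatives mentioned here, with `int` elements and a made-up overloaded helper `insertOrdered`. `std::multiset` is used rather than `std::set` so that duplicates are kept, which is an assumption about what the asker needs.

```cpp
#include <algorithm>
#include <deque>
#include <set>

// Same binary-search insertion as for a vector, but on a std::deque.
void insertOrdered(std::deque<int>& d, int value) {
    d.insert(std::upper_bound(d.begin(), d.end(), value), value);
}

// std::multiset keeps its elements ordered on its own; insert is O(log n).
void insertOrdered(std::multiset<int>& s, int value) {
    s.insert(value);
}
```

Both containers end up with the same in-order sequence; they differ in memory layout and in how much work each insertion does.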

无所谓啦 2024-08-03 14:58:45

You might want to consider using a BTree or a Judy Trie.

  • You don't want to use contiguous memory for large collections, insertions should not take O(n) time;
  • You want to use at least binary insertion for single elements, multiple elements should be presorted so you can make the search boundaries smaller;
  • You do not want your data structure wasting memory, so nothing with left and right pointers for each data element.

沫雨熙 2024-08-03 14:58:45

As others have said, I'd probably have created a BTree out of a linked list instead of using a vector. Even if you got past the sorting issue, vectors have the problem of fully reallocating when they need to grow, assuming you don't know your maximum size beforehand.

If you are worried about a list allocating on different memory pages and causing cache-related performance issues, preallocate your nodes in an array (pool the objects) and insert these into the list.

You can add a value in your data type that denotes whether it is allocated off the heap or from a pool. This way, if you detect that your pool runs out of room, you can start allocating off the heap and throw an assert or something to yourself so you know to bump up the pool size (or make this a command line option to set).

Hope this helps, as I see you already have lots of great answers.
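
A very rough sketch of the pooled-node idea, under several assumptions: `int` elements, a singly linked list, and nodes addressed by index into one preallocated `std::vector`. The class name `PooledSortedList` is hypothetical, and the fallback to heap allocation described above is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sorted singly linked list whose nodes live in one
// preallocated array (the "pool"), linked by index instead of pointer.
class PooledSortedList {
public:
    explicit PooledSortedList(std::size_t capacity) { nodes_.reserve(capacity); }

    // Walks the list to the insertion point: O(n) search, O(1) relink.
    void insert(int value) {
        const int fresh = static_cast<int>(nodes_.size());
        nodes_.push_back(Node{value, kNone});
        int prev = kNone;
        int cur = head_;
        while (cur != kNone && !(value < nodes_[cur].value)) {
            prev = cur;
            cur = nodes_[cur].next;
        }
        nodes_[fresh].next = cur;
        if (prev == kNone) head_ = fresh; else nodes_[prev].next = fresh;
    }

    // Returns the values in list order.
    std::vector<int> toVector() const {
        std::vector<int> out;
        for (int i = head_; i != kNone; i = nodes_[i].next) out.push_back(nodes_[i].value);
        return out;
    }

private:
    struct Node { int value; int next; };
    static const int kNone = -1; // marks "no node"
    std::vector<Node> nodes_;    // the pool
    int head_ = kNone;
};
```

No elements are shifted on insert, but finding the insertion point is a linear walk, so this trades the vector's O(n) copy for an O(n) traversal.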
