使用红黑树进行排序

发布于 2024-09-10 10:16:42 字数 267 浏览 1 评论 0原文

红黑树上插入的最坏情况运行时间是O(lg n)并且如果我执行按顺序遍历在树上,我基本上访问每个节点,因此打印排序集合的最坏情况运行时间将是 O(n lg n)

我很好奇,为什么不首选红黑树通过快速排序进行排序(其平均情况运行时间为O(n lg n)

我看到这可能是因为红黑树做没有就地排序,但我不确定,所以也许有人可以提供帮助。

The worst-case running time of insertion on a red-black tree is O(lg n) and if I perform a in-order walk on the tree, I essentially visit each node, so the total worst-case runtime to print the sorted collection would be O(n lg n)

I am curious, why are red-black trees not preferred for sorting over quick sort (whose average-case running time is O(n lg n).

I see that maybe because red-black trees do not sort in-place, but I am not sure, so maybe someone could help.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

请你别敷衍 2024-09-17 10:16:42

了解哪种排序算法性能更好实际上取决于您的数据和情况。

如果你用一般/实用的术语来说,

快速排序(随机选择枢轴/只选择一个固定的枢轴,使最坏情况为 Omega(n^2))可能比红黑树更好,因为(不一定按顺序)的重要性)

  • 快速排序已经就位。这可以保持较低的内存占用。假设这个快速排序例程是处理大量数据的程序的一部分。如果您继续使用大量内存,您的操作系统可能会开始交换进程内存并破坏您的性能。

  • 快速排序内存访问是本地化的。这与缓存/交换配合得很好。

  • 快速排序可以轻松并行化(现在可能更相关)。

  • 如果您尝试通过使用数组来优化二叉树排序(使用不平衡的二叉树),您最终会做类似快速排序的事情!

  • 红黑树有内存开销。您必须可能多次分配节点,树的内存要求是使用数组的两倍/三倍。

  • ,假设您想要第 1045 个(比如说)元素,您将需要在树中维护顺序统计信息(因此需要额外的内存成本),并且您将拥有 O(logn) 访问时间!

  • 红黑树有开销只是为了访问下一个元素(指针查找)

  • 红黑树表现不佳与缓存和指针访问可能会导致更多交换。

  • 红黑树中的旋转将增加 O(nlogn) 中的常数因子。

  • 也许最重要的原因(但如果您有可用的库等则无效),快速排序非常容易理解和实现。即使是小学生也能理解它!

我想说你尝试衡量这两种实现,看看会发生什么!

此外,Bob Sedgewick 还写了一篇关于快速排序的论文!可能值得一读。

Knowing which sort algorithm performs better really depend on your data and situation.

If you are talking in general/practical terms,

Quicksort (the one where you select the pivot randomly/just pick one fixed, making worst case Omega(n^2)) might be better than Red-Black Trees because (not necessarily in order of importance)

  • Quicksort is in-place. The keeps your memory footprint low. Say this quicksort routine was part of a program which deals with a lot of data. If you kept using large amounts of memory, your OS could start swapping your process memory and trash your perf.

  • Quicksort memory accesses are localized. This plays well with the caching/swapping.

  • Quicksort can be easily parallelized (probably more relevant these days).

  • If you were to try and optimize binary tree sorting (using binary tree without balancing) by using an array instead, you will end up doing something like Quicksort!

  • Red-Black trees have memory overheads. You have to allocate nodes possibly multiple times, your memory requirements with trees is doubles/triple that using arrays.

  • After sorting, say you wanted the 1045th (say) element, you will need to maintain order statistics in your tree (extra memory cost because of this) and you will have O(logn) access time!

  • Red-black trees have overheads just to access the next element (pointer lookups)

  • Red-black trees do not play well with the cache and the pointer accesses could induce more swapping.

  • Rotation in red-black trees will increase the constant factor in the O(nlogn).

  • Perhaps the most important reason (but not valid if you have lib etc available), Quicksort is very simple to understand and implement. Even a school kid can understand it!

I would say you try to measure both implementations and see what happens!

Also, Bob Sedgewick did a thesis on quicksort! Might be worth reading.

看海 2024-09-17 10:16:42

有很多排序算法在最坏情况下都是O(n log n) - 例如,合并排序。首选快速排序的原因是它在实践中速度更快,尽管在算法上它可能不如其他一些算法。

通常,内置排序根据 n 的值使用各种方法的组合。

There are plenty of sorting algorithms which are worst case O(n log n) - for example, merge sort. The reason quicksort is preferred is because it is faster in practice, even though algorithmically it may not be as good as some other algorithms.

Often in-built sorts use a combination of various methods depending on the values of n.

眼藏柔 2024-09-17 10:16:42

在很多情况下,红背树对于排序来说还不错。我的测试表明,与自然合并排序相比,红黑树在以下方面表现出色:

树更适合重复:
所有需要消除重复的测试,树算法更好。这并不令人惊讶,因为树从一开始就可以保持非常小,因此为内联数组排序设计的算法可能会在更长的时间内传递更大的段。

树更适合随机:
所有测试均采用随机、树算法较好。这也并不令人惊讶,因为在树中元素之间的距离较短并且不需要移动。因此,重复插入树可能比对数组进行排序需要更少的工作。

所以我们得到的印象是自然合并排序只在升序和降序特殊情况下表现出色。甚至不能说快速排序。

此处的测试用例要点。

PS:应该注意的是,使用树进行排序并不简单。人们不仅需要提供一个插入例程,还需要一个可以将树线性化回数组的例程。我们当前正在使用 get_last 和前置例程,它不需要堆栈。但这些例程不是 O(1),因为它们包含循环。

There are many cases where red-back trees are not bad for sorting. My testing showed, compared to natural merge sort, that red-black trees excel where:

Trees are better for Dups:
All the tests where dups need to be eleminated, tree algorithm is better. This is not astonishing, since the tree can be kept very small from the beginning, whereby algorithms that are designed for inline array sort might pass around larger segments for a longer time.

Trees are better for Random:
All the tests with random, tree algorithm is better. This is also not astonishing, since in a tree distance between elements is shorter and shifting is not necessary. So repeatedly inserting into a tree could need less effort than sorting an array.

So we get the impression that the natural merge sort only excels in ascending and descending special cases. Which cant be even said for quick sort.

Gist with the test cases here.

P.S.: it should be noted that using trees for sorting is non-trivial. One has not only to provide an insert routine but also a routine that can linearize the tree back to an array. We are currently using a get_last and a predecessor routine, which doesn't need a stack. But these routines are not O(1) since they contain loops.

葬﹪忆之殇 2024-09-17 10:16:42

Big-O 时间复杂度度量通常不考虑标量因子,例如,O(2n) 和 O(4n) 通常只是减少到 O(n)。时间复杂度分析基于算法级别的操作步骤,而不是严格的编程级别,即不考虑源代码或本机机器指令。

快速排序通常比基于树的排序更快,因为(1)这些方法具有相同的算法平均时间复杂度,并且(2)与红黑树相比,使用简单数组时,查找和交换操作需要更少的程序命令和数据访问,即使树使用底层基于数组的实现。与快速排序的简单数组分区交换步骤相比,红黑树约束的维护需要额外的操作步骤、数据字段值存储/访​​问(节点颜色)等。

最终结果是红黑树比快速排序具有更高的标量系数,但标准 O(n log n) 平均时间复杂度分析结果掩盖了这一点。

Wikipedia 上的快速排序文章中简要讨论了与机器架构相关的其他一些实际注意事项

Big-O time complexity measures do not usually take into account scalar factors, e.g., O(2n) and O(4n) are usually just reduced to O(n). Time complexity analysis is based on operational steps at an algorithmic level, not at a strict programming level, i.e., no source code or native machine instruction considerations.

Quicksort is generally faster than tree-based sorting since (1) the methods have the same algorithmic average time complexity, and (2) lookup and swapping operations require fewer program commands and data accesses when working with simple arrays than with red-black trees, even if the tree uses an underlying array-based implementation. Maintenance of the red-black tree constraints requires additional operational steps, data field value storage/access (node colors), etc than the simple array partition-exchange steps of a quicksort.

The net result is that red-black trees have higher scalar coefficients than quicksort does that are being obscured by the standard O(n log n) average time complexity analysis result.

Some other practical considerations related to machine architectures are briefly discussed in the Quicksort article on Wikipedia

万水千山粽是情ミ 2024-09-17 10:16:42

一般来说,O(nlgn) 算法的表示可以扩展到 A*nlgn + B,其中 A 和 B 是常数。有许多算法证明表明快速排序的系数比其他算法的系数小。这是最好的情况(快速排序在排序数据上的表现非常糟糕)。

Generally, representations of O(nlgn) algorithms can be expanded to A*nlgn + B where A and B are constants. There are many algorithmic proofs that show the coefficients for quicksort are smaller than those of other algorithms. That is in best-case (quick sort performs horribly on sorted data).

坚持沉默 2024-09-17 10:16:42

您好,我认为解释所有排序例程之间差异的最佳方法是。
(我的答案是针对那些对快速排序在实践中比其他排序算法更快的情况感到困惑的人)。

“认为你正在一台非常慢的计算机上运行”。

  1. 首先,比较操作需要 1 小时。
  2. 一次换班操作需要2小时。

“我用小时只是为了让人们了解时间的重要性”。

现在,从所有排序操作来看,快速排序的比较次数和元素交换都非常少。

由于这个主要原因,快速排序速度更快。

Hi the best way to explain the difference between all sorting routine in my opinion is.
(My answer is for people who are confused how quick sort is faster in practice than another sorting algo).

"Think u are running on a very slow computer".

  1. First thing one comparing operation takes 1 hour.
  2. One shifting operation takes 2 hours.

"I am using hour just to make people understand how important time is".

Now from all the sorting operations quick-sort have very very less comparisons and very less swapping for elements.

Quick-sort is faster for this main reason.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文