timsort和quicksort的比较
为什么我经常听说 Quicksort 是最快的整体排序算法,根据 Wikipedia ,Timsort 好像表现好多了?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
为什么我经常听说 Quicksort 是最快的整体排序算法,根据 Wikipedia ,Timsort 好像表现好多了?
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
接受
或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
发布评论
评论(6)
TimSort 是一种高度优化的归并排序,它比旧的归并排序稳定且更快。
与快速排序相比,它有两个优点:
老实说,我不认为#1 是一个优势,但它确实给我留下了深刻的印象。
QuickSort 的优点如下:
目前,Java 7 SDK 实现了 timsort 和一个新的快速排序变体:即 Dual Pivot QuickSort。
如果需要稳定排序,请尝试 timsort,否则从快速排序开始。
TimSort is a highly optimized mergesort, it is stable and faster than old mergesort.
when comparing with quicksort, it has two advantages:
To be honest, I don't think #1 is a advantage, but it did impress me.
Here are QuickSort's advantages
Currently, Java 7 SDK implements timsort and a new quicksort variant: i.e. Dual Pivot QuickSort.
If you need stable sort, try timsort, otherwise start with quicksort.
或多或少,这与 Timsort 是一种混合排序算法有关。这意味着虽然它使用的两种底层排序(合并排序和插入排序)都比 快速排序 更糟糕对于许多类型的数据,Timsort 仅在有利时才使用它们。
在更深层次上,正如 Patrick87 所说,快速排序是最坏情况的 O(n2) 算法。选择一个好的主元并不难,但保证 O(n log n )快速排序的代价是平均排序速度通常较慢。
有关 Timsort 的更多详细信息,请参阅此答案以及链接的博客文章。它基本上假设大多数数据已经部分排序,并构造排序数据的“运行”,以便使用合并排序进行有效合并。
More or less, it has to do with the fact that Timsort is a hybrid sorting algorithm. This means that while the two underlying sorts it uses (Mergesort and Insertion sort) are both worse than Quicksort for many kinds of data, Timsort only uses them when it is advantageous to do so.
On a slightly deeper level, as Patrick87 states, quicksort is a worst-case O(n2) algorithm. Choosing a good pivot isn't hard, but guaranteeing an O(n log n) quicksort comes at the cost of generally slower sorting on average.
For more detail on Timsort, see this answer, and the linked blog post. It basically assumes that most data is already partially sorted, and constructs "runs" of sorted data that allow for efficient merges using mergesort.
一般来说,快速排序是原始数组的最佳算法。这是由于内存局部性和缓存造成的。
JDK7 使用 TimSort 作为对象数组。对象数组仅保存对象引用。对象本身存储在Heap中。为了比较对象,我们需要从堆中读取对象。这就像从堆的一部分读取一个对象,然后从堆的另一部分随机读取对象。会有大量的缓存未命中。我想由于这个原因,内存局部性不再重要了。这可能就是 JDK 只使用 TimSort 作为对象数组而不是原始数组的原因。
这只是我的猜测。
Generally speaking quicksort is best algorithm for primitive array. This is due to memory locality and cache.
JDK7 uses TimSort for Object array. Object array only holds object reference. The object itself is stored in Heap. To compare object, we need to read object from heap. This is like reading from one part of the heap for one object, then randomly reading object from another part of heap. There will be a lot of cache miss. I guess for this reason memory locality is not important any more. This is may be why JDK only uses TimSort for Object array instead if primitive array.
This is only my guess.
如果您需要保留顺序的排序,或者您要对复杂数组(比较基于堆的对象)而不是原始数组进行排序,那么 Tim Sort 就非常有用。正如其他人所提到的,快速排序从数据的局部性和原始数组的处理器缓存中受益匪浅。
提出了快速排序最坏情况是 O(n^2) 的事实。幸运的是,您可以使用快速排序实现 O(n log n) 时间的最坏情况。当枢轴点是最小值或最大值时,例如当枢轴是已排序数组的第一个或最后一个元素时,快速排序最坏的情况就会发生。
通过将主元设置为中值,我们可以实现 O(n log n) 最坏情况快速排序。因为找到中值可以在线性时间 O(n) 内完成。由于 O(n) + O(n log n) = O(n log n),这成为最坏情况的时间复杂度。
然而,在实践中,大多数实现发现随机主元就足够了,因此不搜索中值。
Tim Sort is great if you need an order-preserving sort, or if you are sorting a complex array (comparing heap-based objects) rather than a primitive array. As mentioned by others, quicksort benefits significantly from the locality of data and processor caching for primitive arrays.
The fact that the worst case of quicksort is O(n^2) was raised. Fortunately, you can achieve O(n log n) time worst-case with quicksort. The quicksort worst-case occurs when the pivot point is either the smallest or largest value such as when the pivot is the first or last element of an already sorted array.
We can achieve O(n log n) worst-case quicksort by setting the pivot at the median value. Since finding the median value can be done in linear time O(n). Since O(n) + O(n log n) = O(n log n), that becomes the worst-case time complexity.
In practice, however, most implementations find that a random pivot is sufficient so do not search for the median value.
以下是我的机器的基准测试数据(i7-6700 CPU,3.4GHz,Ubuntu 16.04,gcc 5.4.0,参数:SIZE=100000 和 RUNS=3):
基准测试来自 Swenson 的 sort 项目,他在其中用 C 实现了多种排序算法。 大概,他的实现足够好,具有代表性,但我还没有调查它们。
所以你真的说不出来。基准数据最多只能保持两年的相关性,然后你就必须重复它们。也许早在 2011 年提出这个问题时,timsort 就击败了 qsort waaay,但时代已经变了。或者 qsort 总是最快的,但 timsort 在非随机数据上击败了它。或者 Swenson 的代码不太好,更好的程序员会扭转局势,对 timsort 有利。或者也许我很糟糕,在编译代码时没有使用正确的
CFLAGS
。或者...你明白了。Here are benchmark numbers from my machine (i7-6700 CPU, 3.4GHz, Ubuntu 16.04, gcc 5.4.0, parameters: SIZE=100000 and RUNS=3):
The benchmark comes from Swenson's sort project in which he as implemented several sorting algorithms in C. Presumably, his implementations are good enough to be representative, but I haven't investigated them.
So you really can't tell. Benchmark numbers only stay relevant for at most two years and then you have to repeat them. Possibly, timsort beat qsort waaay back in 2011 when the question was asked, but the times have changed. Or qsort was always the fastest, but timsort beat it on non-random data. Or Swenson's code isn't so good and a better programmer would turn the tide in timsort's favor. Or perhaps I suck and didn't use the right
CFLAGS
when compiling the code. Or... You get the point.Timsort 是一种流行的混合排序算法,由 Tim Peters 于 2002 年设计。它是插入排序和归并排序的结合。它的开发是为了在各种现实世界数据集上表现良好。它是一种快速、稳定和自适应的排序技术,平均和最坏情况下的性能为
O(n log n)
。Timsort 的工作原理
Timsort的优点
快速排序是一种非常有用且高效的排序算法,它将大量数据划分为较小的数据,它基于分而治之的概念。 Tony Hoare 在 1959 年设计了这种排序算法,平均性能为
O(n log n)
。快速排序的工作原理
Quicksort 的优点
Timsort is a popular hybrid sorting algorithm designed in 2002 by Tim Peters. It is a combination of insertion sort and merge sort. It is developed to perform well on various kinds of real world data sets. It is a fast, stable and adaptive sorting technique with average and worst-case performance of
O(n log n)
.How Timsort works
Advantages of Timsort
Quicksort is a highly useful and efficient sorting algorithm that divides a large array of data into smaller ones and it is based on the concept of Divide and Conquer. Tony Hoare designed this sorting algorithm in 1959 with average performance of
O(n log n)
.How Quicksort works
Advantages of Quicksort