Quicksort vs. Heapsort

Both quicksort and heapsort do in-place sorting. Which is better? What are the applications and cases in which either is preferred?

Comments (13)

残花月 2024-09-01 18:05:56

Heapsort is a safe bet when dealing with very large inputs. Asymptotic analysis shows that heapsort's worst-case order of growth is O(n log n), which is better than quicksort's O(n^2) worst case. However, heapsort is somewhat slower in practice on most machines than a well-implemented quicksort. Heapsort is also not a stable sorting algorithm.

The reason heapsort is slower in practice than quicksort is quicksort's better locality of reference (https://en.wikipedia.org/wiki/Locality_of_reference): its data accesses stay within relatively close storage locations, and systems that exhibit strong locality of reference are great candidates for performance optimization. Heapsort, by contrast, makes much larger jumps through the array. This makes quicksort more favorable for smaller inputs.
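To make the locality point concrete, here is a minimal iterative, in-place heapsort sketch (the method names are mine, not from the thread). Note how SiftDown jumps from index i to 2*i + 1: on a large array, successive accesses land far apart in memory, whereas quicksort's partition scans walk a sub-array sequentially.

static void HeapSort(int[] arr)
{
    int n = arr.Length;
    // Build a max-heap bottom-up.
    for (int i = n / 2 - 1; i >= 0; i--) SiftDown(arr, i, n);
    // Repeatedly move the current maximum to the end and shrink the heap.
    for (int end = n - 1; end > 0; end--)
    {
        int tmp = arr[0]; arr[0] = arr[end]; arr[end] = tmp;
        SiftDown(arr, 0, end);
    }
}

static void SiftDown(int[] arr, int root, int size)
{
    while (true)
    {
        int child = 2 * root + 1;        // left child: a long jump in memory when root is large
        if (child >= size) return;
        if (child + 1 < size && arr[child + 1] > arr[child]) child++;
        if (arr[root] >= arr[child]) return;
        int tmp = arr[root]; arr[root] = arr[child]; arr[child] = tmp;
        root = child;                    // the next jump is roughly twice as far again
    }
}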

疑心病 2024-09-01 18:05:56

In simple terms: HeapSort has a guaranteed worst-case running time of O(n log n), as opposed to QuickSort's average running time of O(n log n). QuickSort is usually used in practice because it is typically faster, but HeapSort is used for external sorting, when you need to sort huge files that don't fit into your computer's memory.
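To illustrate the external-sorting remark above: the place a heap typically shows up there is the k-way merge of sorted runs. Below is a minimal in-memory sketch (the names and structure are my own assumptions; a real external sort would stream the runs from disk). It uses PriorityQueue<TElement, TPriority> from System.Collections.Generic, available since .NET 6.

// Merge k sorted runs into one sorted output using a min-heap keyed on each run's head value.
// In a real external sort the runs would be files on disk and the output would be streamed.
static List<int> MergeSortedRuns(List<int[]> runs)
{
    var output = new List<int>();
    var heap = new PriorityQueue<(int run, int pos), int>();   // element = position, priority = value

    for (int r = 0; r < runs.Count; r++)
        if (runs[r].Length > 0)
            heap.Enqueue((r, 0), runs[r][0]);

    while (heap.TryDequeue(out var head, out int value))
    {
        output.Add(value);                                     // smallest remaining element overall
        int next = head.pos + 1;
        if (next < runs[head.run].Length)                      // refill the heap from the same run
            heap.Enqueue((head.run, next), runs[head.run][next]);
    }
    return output;
}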

铃予 2024-09-01 18:05:56

In practice, barely anything ever beats QuickSort when speed is your main concern, unless you can use RadixSort (RadixSort is usually only possible when sorting numbers). QuickSort typically beats HeapSort and MergeSort by a fair margin.

If you keep increasing the number of elements to be sorted, there is a break-even point where HeapSort and MergeSort will overtake QuickSort: they are always O(n * log2(n)), whereas QuickSort is only typically in that range, and how it really performs depends a lot on the data to be sorted and on implementation details. So even if QuickSort is initially faster, its speed falls off more quickly as the number of elements grows than is the case for HeapSort or MergeSort, which means that at some point both will overtake QuickSort. In practice, however, the data sets to be sorted usually never get big enough to reach that break-even point, or cannot get big enough because your system would run out of resources before then.

There are only a few unfortunate cases where QuickSort may fail horribly depending on your implementation, e.g. when the initial dataset is already sorted or reverse sorted or follows another weird sorting pattern (e.g. every pair of elements is reverse sorted and all pairs are sorted). Yet those can be avoided by making the QuickSort implementation more robust. Typical optimizations include:

  • Instead of choosing the Pivot element to be always the element in the center, choose it randomly
  • Instead of just picking a Pivot element, pick three (from fixed positions or randomly), then take the one in the middle
  • Instead of partitioning into two sub-arrays, partition into three (smaller than Pivot, bigger than Pivot, and equal to Pivot; usually equal elements are either kept in the smaller or bigger sub-array).
  • Once the size of the sub-array to be sorted falls below a certain threshold (numbers around 4-16 are popular), sort this sub-array by an alternative algorithm (InsertionSort is quite popular for this task but pretty much any fallback will do; this is to avoid requiring too many recursions or too much space in an auxiliary array)
  • Instead of switching to a different algorithm below a certain threshold, you can also just stop sorting completely below that threshold and then run an alternative algorithm over the entire data, which will now benefit from an almost-sorted data set (this only works if you choose the right second algorithm)
  • Shuffle the data set prior to applying QuickSort (this seems stupid, but shuffling totally unsorted data won't make QuickSort any slower, yet it avoids issues if the data happens to be pre-sorted)

All of these changes will make QuickSort slower, sometimes a tiny bit, sometimes a bit more, but they also prevent situations where QuickSort would otherwise become unbearably slow. They have very little impact if your data is more or less random, but a huge positive impact if it isn't. (A sketch combining two of these tweaks follows below.)
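As a rough illustration only (the method names and the cutoff value are my own choices, not a reference implementation), here is a sketch combining two of the tweaks above: a median-of-three pivot and an insertion-sort cutoff for small sub-arrays. It also recurses into the smaller partition first, which keeps the stack depth at O(log n).

const int Cutoff = 16;                       // below this size, insertion sort takes over

static void RobustQuickSort(int[] arr, int lo, int hi)
{
    while (hi - lo + 1 > Cutoff)
    {
        int p = Partition(arr, lo, hi);
        // Recurse into the smaller half, loop on the larger: stack depth stays O(log n).
        if (p - lo < hi - p) { RobustQuickSort(arr, lo, p - 1); lo = p + 1; }
        else { RobustQuickSort(arr, p + 1, hi); hi = p - 1; }
    }
    InsertionSort(arr, lo, hi);              // finish small ranges with a simple, cache-friendly sort
}

static int Partition(int[] arr, int lo, int hi)
{
    // Median-of-three: order arr[lo], arr[mid], arr[hi], then use the median as the pivot.
    int mid = lo + (hi - lo) / 2;
    if (arr[mid] < arr[lo]) Swap(arr, lo, mid);
    if (arr[hi] < arr[lo]) Swap(arr, lo, hi);
    if (arr[hi] < arr[mid]) Swap(arr, mid, hi);
    Swap(arr, mid, hi - 1);                  // park the pivot just before the end
    int pivot = arr[hi - 1];

    int i = lo, j = hi - 1;
    while (true)
    {
        while (arr[++i] < pivot) { }         // arr[hi - 1] == pivot acts as a sentinel
        while (arr[--j] > pivot) { }         // arr[lo] <= pivot acts as a sentinel
        if (i >= j) break;
        Swap(arr, i, j);
    }
    Swap(arr, i, hi - 1);                    // put the pivot into its final position
    return i;
}

static void InsertionSort(int[] arr, int lo, int hi)
{
    for (int i = lo + 1; i <= hi; i++)
    {
        int key = arr[i], j = i - 1;
        while (j >= lo && arr[j] > key) { arr[j + 1] = arr[j]; j--; }
        arr[j + 1] = key;
    }
}

static void Swap(int[] arr, int a, int b) { int t = arr[a]; arr[a] = arr[b]; arr[b] = t; }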

深府石板幽径 2024-09-01 18:05:56

To answer the original question and address some of the other comments here:

I just compared implementations of selection, quick, merge, and heap sort to see how they'd stack up against each other. The answer is that they all have their downsides.

TL;DR:
Quick is the best general purpose sort (reasonably fast, stable, and mostly in-place)
Personally I prefer heap sort though unless I need a stable sort.

Selection - N^2 - It's really only good for less than 20 elements or so, then it's outperformed. Unless your data is already sorted, or very, very nearly so. N^2 gets really slow really fast.

Quick, in my experience, is not actually that quick all the time. Bonuses for using quick sort as a general sort though are that it's reasonably fast and it's stable. It's also an in-place algorithm, but as it's generally implemented recursively, it will take up additional stack space. It also falls somewhere between O(n log n) and O(n^2). Timing on some sorts seems to confirm this, especially when the values fall within a tight range. It's way faster than selection sort on 10,000,000 items, but slower than merge or heap.

Merge sort is guaranteed O(n log n) since its sort is not data dependent. It just does what it does, regardless of what values you've given it. It's also stable, but very large sorts can blow out your stack if you're not careful about implementation. There are some complex in-place merge sort implementations, but generally you need another array in each level to merge your values into. If those arrays live on the stack you can run into issues.
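One common way to address the auxiliary-array concern above is to allocate a single scratch buffer once (on the managed heap, not the stack) and reuse it at every recursion level, so the recursion itself only costs O(log n) small stack frames. A minimal sketch, with names of my own choosing:

static void MergeSort(int[] arr)
{
    int[] scratch = new int[arr.Length];     // one heap-allocated buffer, reused at every level
    MergeSort(arr, scratch, 0, arr.Length - 1);
}

static void MergeSort(int[] arr, int[] scratch, int lo, int hi)
{
    if (lo >= hi) return;
    int mid = lo + (hi - lo) / 2;
    MergeSort(arr, scratch, lo, mid);
    MergeSort(arr, scratch, mid + 1, hi);

    // Merge the two sorted halves through the shared scratch buffer (<= keeps the sort stable).
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi) scratch[k++] = arr[i] <= arr[j] ? arr[i++] : arr[j++];
    while (i <= mid) scratch[k++] = arr[i++];
    while (j <= hi) scratch[k++] = arr[j++];
    for (int m = lo; m <= hi; m++) arr[m] = scratch[m];    // copy the merged range back
}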

Heap sort is max O(n log n), but in many cases is quicker, depending on how far you have to move your values up the log n deep heap. The heap can easily be implemented in-place in the original array, so it needs no additional memory, and it's iterative, so no worries about stack overflow while recursing. The huge downside to heap sort is that it is not a stable sort, which means it's right out if you need that.

乞讨 2024-09-01 18:05:55

Heapsort is guaranteed O(N log N), which is much better than the worst case in Quicksort. Heapsort doesn't need extra memory for another array to put ordered data into, as Mergesort does. So why do commercial applications stick with Quicksort? What does Quicksort have that is so special over other implementations?

I've tested the algorithms myself and I've seen that Quicksort has something special indeed. It runs fast, much faster than Heap and Merge algorithms.

The secret of Quicksort is that it does hardly any unnecessary element swaps, and swaps are time consuming.

With Heapsort, even if all of your data is already ordered, you are going to swap 100% of elements to order the array.

With Mergesort, it's even worse. You are going to write 100% of elements in another array and write it back in the original one, even if data is already ordered.

With Quicksort you don't swap what is already ordered. If your data is completely ordered, you swap almost nothing! Although there is a lot of fussing about the worst case, a little improvement in the choice of pivot, anything other than taking the first or last element of the array, can avoid it. If you take the pivot as the median of the first, last, and middle elements, it is sufficient to avoid the worst case.

What is superior in Quicksort is not the worst case, but the best case! In the best case you do the same number of comparisons, OK, but you swap almost nothing. In the average case you swap some of the elements, but not all of them, as in Heapsort and Mergesort. That is what gives Quicksort the best time: fewer swaps, more speed.

The implementation below, in C# on my computer and running in release mode, beats Array.Sort by 3 seconds with the middle-element pivot and by 2 seconds with the improved pivot (yes, there is an overhead to getting a good pivot).

static void Main(string[] args)
{
    int[] arrToSort = new int[100000000];
    var r = new Random();
    for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);

    Console.WriteLine("Press q to quick sort, s to Array.Sort");
    while (true)
    {
        var k = Console.ReadKey(true);
        if (k.KeyChar == 'q')
        {
            // quick sort
            Console.WriteLine("Beg quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            QuickSort(arrToSort, 0, arrToSort.Length - 1);
            Console.WriteLine("End quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
        else if (k.KeyChar == 's')
        {
            Console.WriteLine("Beg Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            Array.Sort(arrToSort);
            Console.WriteLine("End Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
    }
}

static public void QuickSort(int[] arr, int left, int right)
{
    int begin = left
        , end = right
        , pivot
        // get middle element pivot
        //= arr[(left + right) / 2]
        ;

    // improved pivot: take the median of the first, middle and last elements
    int middle = (left + right) / 2;
    int
        LM = arr[left].CompareTo(arr[middle])
        , MR = arr[middle].CompareTo(arr[right])
        , LR = arr[left].CompareTo(arr[right])
        ;
    if (-1 * LM == LR)
        pivot = arr[left];
    else
        if (MR == -1 * LR)
            pivot = arr[right];
        else
            pivot = arr[middle];
    // Hoare-style partition: advance from both ends and swap out-of-place pairs
    do
    {
        while (arr[left] < pivot) left++;
        while (arr[right] > pivot) right--;

        if(left <= right)
        {
            int temp = arr[right];
            arr[right] = arr[left];
            arr[left] = temp;

            left++;
            right--;
        }
    } while (left <= right);

    if (left < end) QuickSort(arr, left, end);
    if (begin < right) QuickSort(arr, begin, right);
}

﹂绝世的画 2024-09-01 18:05:55

This paper has some analysis.

Also, from Wikipedia:

The most direct competitor of quicksort is heapsort. Heapsort is typically somewhat slower than quicksort, but the worst-case running time is always Θ(n log n). Quicksort is usually faster, though there remains the chance of worst-case performance except in the introsort variant, which switches to heapsort when a bad case is detected. If it is known in advance that heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it.
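As a rough sketch of the introsort idea mentioned in the quote (the names and the depth limit of 2*log2(n) are my own assumptions, not any library's actual implementation): run quicksort as usual, but give each call a depth budget, and hand any sub-range whose budget runs out over to heapsort, so the quadratic case can never materialize. Math.Log2 requires .NET Core 3.0 or later.

static void IntroSort(int[] arr)
{
    // Depth budget of roughly 2 * log2(n); exceeding it means the pivots are doing badly.
    int depthLimit = 2 * (int)Math.Log2(Math.Max(arr.Length, 1));
    IntroSort(arr, 0, arr.Length - 1, depthLimit);
}

static void IntroSort(int[] arr, int lo, int hi, int depthLimit)
{
    if (lo >= hi) return;
    if (depthLimit == 0)
    {
        HeapSortRange(arr, lo, hi);          // bad case detected: switch to heapsort for this range
        return;
    }
    int p = HoarePartition(arr, lo, hi);
    IntroSort(arr, lo, p, depthLimit - 1);
    IntroSort(arr, p + 1, hi, depthLimit - 1);
}

static int HoarePartition(int[] arr, int lo, int hi)
{
    int pivot = arr[lo + (hi - lo) / 2];
    int i = lo - 1, j = hi + 1;
    while (true)
    {
        do { i++; } while (arr[i] < pivot);
        do { j--; } while (arr[j] > pivot);
        if (i >= j) return j;
        int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
    }
}

// Heapsort restricted to arr[lo..hi]; used only as the fallback.
static void HeapSortRange(int[] arr, int lo, int hi)
{
    int n = hi - lo + 1;
    for (int i = n / 2 - 1; i >= 0; i--) SiftDownRange(arr, lo, i, n);
    for (int end = n - 1; end > 0; end--)
    {
        int tmp = arr[lo]; arr[lo] = arr[lo + end]; arr[lo + end] = tmp;
        SiftDownRange(arr, lo, 0, end);
    }
}

static void SiftDownRange(int[] arr, int lo, int root, int size)
{
    while (true)
    {
        int child = 2 * root + 1;
        if (child >= size) return;
        if (child + 1 < size && arr[lo + child + 1] > arr[lo + child]) child++;
        if (arr[lo + root] >= arr[lo + child]) return;
        int tmp = arr[lo + root]; arr[lo + root] = arr[lo + child]; arr[lo + child] = tmp;
        root = child;
    }
}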

作死小能手 2024-09-01 18:05:55

For most situations, having quick vs. a little quicker is irrelevant... you simply never want it to occasionally get waayyy slow. Although you can tweak QuickSort to avoid the way slow situations, you lose the elegance of the basic QuickSort. So, for most things, I actually prefer HeapSort... you can implement it in its full simple elegance, and never get a slow sort.

For situations where you DO want max speed in most cases, QuickSort may be preferred over HeapSort, but neither may be the right answer. For speed-critical situations, it is worth examining closely the details of the situation. For example, in some of my speed-critical code, it is very common that the data is already sorted or near-sorted (it is indexing multiple related fields that often either move up and down together OR move up and down opposite each other, so once you sort by one, the others are either sorted or reverse-sorted or close... either of which can kill QuickSort). For that case, I implemented neither... instead, I implemented Dijkstra's SmoothSort... a HeapSort variant that is O(N) when already sorted or near-sorted... it is not so elegant, not too easy to understand, but fast... read http://www.cs.utexas.edu/users/EWD/ewd07xx/EWD796a.PDF if you want something a bit more challenging to code.

我一直都在从未离去 2024-09-01 18:05:55

Quicksort-heapsort in-place hybrids are really interesting too, since most of them only need n*log n comparisons in the worst case (they are optimal with respect to the first term of the asymptotics, so they avoid the worst-case scenarios of quicksort), they need only O(log n) extra space, and they preserve at least "a half" of quicksort's good behaviour on already-ordered data. An extremely interesting algorithm is presented by Diekert and Weiß in http://arxiv.org/pdf/1209.4214v1.pdf:

  • Select a pivot p as the median of a random sample of sqrt(n) elements (this can be done in at most 24 sqrt(n) comparisons through the algorithm of Tarjan&co, or 5 sqrt(n) comparisons through the much more convoluted spider-factory algorithm of Schonhage);
  • Partition your array in two parts as in the first step of Quicksort;
  • Heapify the smallest part and use O(log n) extra bits to encode a heap in which every left child has a value greater than its sibling;
  • Recursively extract the root of the heap, sift the gap left by the root down until it reaches a leaf of the heap, then fill the gap with an appropriate element taken from the other part of the array;
  • Recur over the remaining non-ordered part of the array (if p is chosen as the exact median, there is no recursion at all).

无需解释 2024-09-01 18:05:55

Comparing quicksort and heapsort: since both sort in place, the difference lies in the worst-case running time, which is O(n^2) for quicksort while for heapsort it is still O(n*log(n)), so for an average amount of data quicksort will be more useful. Since randomized quicksort is a randomized algorithm, the probability of finishing quickly depends on the position of the pivot element you choose.

So, with s elements in the current sub-array and L and G the elements smaller and greater than the pivot:

Good call: the sizes of L and G are each less than 3s/4

Bad call: one of L and G has size greater than 3s/4

For a small amount of data we can go for insertion sort, and for a very large amount of data go for heapsort.
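To make the "good call" definition above concrete: with a pivot chosen uniformly at random from s distinct keys, a good call happens exactly when the pivot's rank falls in the middle half, which is about half the time regardless of s. A tiny simulation sketch (the names are my own):

static double EstimateGoodCallRate(int s, int trials)
{
    var rng = new Random(42);
    int good = 0;
    for (int t = 0; t < trials; t++)
    {
        int pivotRank = rng.Next(s);         // rank of a uniformly random pivot among s distinct keys
        int sizeL = pivotRank;               // elements smaller than the pivot
        int sizeG = s - 1 - pivotRank;       // elements greater than the pivot
        if (sizeL < 3.0 * s / 4 && sizeG < 3.0 * s / 4) good++;
    }
    return (double)good / trials;            // converges to about 0.5, e.g. EstimateGoodCallRate(1000, 100000)
}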

秋叶绚丽 2024-09-01 18:05:55

Heapsort has the benefit of a worst-case running time of O(n*log(n)), so in cases where quicksort is likely to perform poorly (generally, mostly-sorted data sets) heapsort is much preferred.

忆伤 2024-09-01 18:05:55

To me there is a very fundamental difference between heapsort and quicksort: the latter uses recursion. In a recursive algorithm the call stack grows with the recursion depth. This does not matter if n is small, but right now I am sorting two matrices with n=10^9!! The program takes almost 10 GB of RAM, and any extra memory makes my computer start swapping to virtual disk memory. My disk is a RAM disk, but swapping to it still makes a huge difference in speed. So in a statpack coded in C++ that includes adjustable-dimension matrices, whose size is unknown in advance to the programmer, and nonparametric statistical sorting, I prefer heapsort to avoid delays when working with very big data matrices.

溇涏 2024-09-01 18:05:55

Well, if you go down to the architecture level: whatever is already sitting in the cache gets processed quickly. In quicksort we have no issue dividing the array into sub-arrays of any length, because each partition step works on a contiguous region; but in heapsort (using an array) it can happen that a parent is not present in the part of the array currently held in cache, and it then has to be brought into cache memory, which is time consuming.
That's why quicksort is best!

秋凉 2024-09-01 18:05:55

Heapsort builds a heap and then repeatedly extracts the maximum item. Its worst case is O(n log n).

But if you look at the worst case of quicksort, which is O(n^2), you realize that quicksort would be a not-so-good choice for large data.

So this makes sorting an interesting thing; I believe the reason so many sorting algorithms exist today is that each of them is 'best' in its own best place. For instance, bubble sort can outperform quicksort if the data is already sorted. Or if we know something about the items to be sorted, we can probably do better.

This may not answer your question directly, but I thought I'd add my two cents.
