Is there an intuitive explanation of heap sort?
At school we are currently learning sorting algorithms in Java and I got for my homework the Heap Sort. I did my reading, I tried to find out as much as I could, but it seems I just can't grasp the concept.
I'm not asking you to write me a Java program, if you could just explain to me as simply as you can how the Heap Sort works.
Right, so basically you take a heap and pull out the first node in the heap - as the first node is guaranteed to be the largest / smallest depending on the direction of sort. The tricky thing is re-balancing / creating the heap in the first place.
Two steps were required for me to understand the heap process - first of all thinking of this as a tree, getting my head around it, then turning that tree into an array so it could be useful.
The second part of that is to essentially traverse the tree breadth first, left to right, adding each element into the array. So the following tree (the original diagram is missing; this is reconstructed from the traversal below):

          73
        /    \
       7      12
      / \    /  \
     2   4  9   10
    /
   1

Would be {73,7,12,2,4,9,10,1}
The first part requires two steps (the original list is missing; from the walkthrough below, the two rules are):
1. Add the new element in the next free position, filling each level of the tree from left to right.
2. If the new element is larger than its parent, swap them, and repeat until it is not.
So to heapify a list of numbers you add each one to the heap, then follow those two steps in order.
To create my heap above I will add 10 first - it's the only node so nothing to do.
Add 12 as its child on the left:
This satisfies 1, but not 2 so I will swap them round:
Add 7 - nothing to do
Add 73
10 < 73 so need to swap those:
12 < 73 so need to swap those:
Add 2 - nothing to do
Add 4 - nothing to do
Add 9
7 < 9 - swap
Add 1 - nothing to do
We have our heap :D
Now you just remove each element from the top, swapping in the last element to the top of the tree each time, then re-balancing the tree:
Take 73 off - putting 1 in its place
1 < 12 - so swap them
1 < 10 - so swap them
Take 12 off - replace with 7
7 < 10 - swap them
Take 10 off - replace with 4
4 < 7 - swap
7 < 9 - swap
Take 9 off - replace with 2
2 < 4 - swap them
4 < 7 - swap them
Take 7 off - replace with 1
1 < 4 - swap them
Take 4 - replace with 1
1 < 2 - swap them
Take 2 - replace with 1
Take 1
Voilà, a sorted list.
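The whole walkthrough above can be sketched in Java. This is a minimal illustration, not a definitive implementation: phase 1 "adds" each element by sifting it up past smaller parents (the two heapify steps), phase 2 repeatedly swaps the root with the last unsorted element and sifts the replacement down. The names `siftUp`/`siftDown` are just illustrative.

```java
public class HeapSortSketch {

    public static void sort(int[] a) {
        // Phase 1: heapify by treating elements as added one at a
        // time, swapping each up past any smaller parent (step 2).
        for (int i = 1; i < a.length; i++) {
            siftUp(a, i);
        }
        // Phase 2: take the root (maximum) off by swapping it to the
        // end, then re-balance the shrunken heap each time.
        for (int end = a.length - 1; end > 0; end--) {
            swap(a, 0, end);      // root goes to its final sorted spot
            siftDown(a, 0, end);  // restore the heap property
        }
    }

    private static void siftUp(int[] a, int child) {
        while (child > 0) {
            int parent = (child - 1) / 2;
            if (a[child] <= a[parent]) break; // heap property holds
            swap(a, child, parent);
            child = parent;
        }
    }

    private static void siftDown(int[] a, int root, int size) {
        while (2 * root + 1 < size) {
            int bigger = 2 * root + 1;                  // left child
            if (bigger + 1 < size && a[bigger + 1] > a[bigger]) {
                bigger++;                               // right child is larger
            }
            if (a[root] >= a[bigger]) break;            // heap property holds
            swap(a, root, bigger);
            root = bigger;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        // Insertion order from the example above: 10, 12, 7, 73, 2, 4, 9, 1
        int[] numbers = {10, 12, 7, 73, 2, 4, 9, 1};
        sort(numbers);
        System.out.println(java.util.Arrays.toString(numbers));
        // prints [1, 2, 4, 7, 9, 10, 12, 73]
    }
}
```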
One way to think of heap sort is as a cleverly optimized version of selection sort. In selection sort, the sort works by repeatedly finding the largest element not yet placed correctly, then putting it into the next correct spot in the array. However, selection sort runs in time O(n²) because it has to do n rounds of finding the largest element out of a bunch (and there can be up to n different elements to look at) and putting it into place.
Intuitively, heap sort works by building up a special data structure called a binary heap that speeds up finding the largest of the unplaced array elements. Binary heaps support the following operations (the original list is missing; the two used below are):
- Insert, which adds a new element to the heap, and
- Delete-Max, which removes and returns the largest element.
At a very high level, the algorithm works as follows (the original steps are missing; reconstructed from the surrounding text):
1. Insert every array element into the heap.
2. Repeatedly call Delete-Max, writing each returned element into the next free slot from the back of the array.
This sorts the array because the elements returned by Delete-Max are in descending order. Once all the elements have been removed, the array is then sorted.
Heap sort is efficient because the Insert and Delete-Max operations on a heap both run in O(log n) time, meaning that n inserts and deletions can be done on the heap in O(n log n) time. A more precise analysis can be used to show that, in fact, it takes Θ(n log n) time regardless of the input array.
Typically, heap sort employs two major optimizations. First, the heap is usually built up in-place inside the array by treating the array itself as a compressed representation of the heap. If you look at a heapsort implementation, you will usually see unusual uses of array indices based on multiplying and dividing by two; these accesses work because they are treating the array as a condensed data structure. As a result, the algorithm requires only O(1) auxiliary storage space.
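The multiply-and-divide index tricks mentioned above can be sketched as small helper functions. This assumes 0-based indexing; the class and method names here are just illustrative:

```java
// Index arithmetic for a binary heap stored flat in an array
// (0-based indexing). The array itself is the compressed heap:
// no node objects or pointers are needed.
public class HeapIndices {
    public static int parent(int i)     { return (i - 1) / 2; } // divide by two
    public static int leftChild(int i)  { return 2 * i + 1; }   // multiply by two
    public static int rightChild(int i) { return 2 * i + 2; }

    public static void main(String[] args) {
        // In the heap array {73,7,12,2,4,9,10,1} from the first answer,
        // the children of index 1 (value 7) sit at indices 3 and 4
        // (values 2 and 4), and the parent of index 4 is index 1.
        System.out.println(leftChild(1));  // 3
        System.out.println(rightChild(1)); // 4
        System.out.println(parent(4));     // 1
    }
}
```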
Second, rather than building up the heap one element at a time, the heap is usually built using a specialized algorithm that runs in time Θ(n) to build the heap in-place. Interestingly, in some cases this ends up making the code easier to read because code can be reused, but the algorithm itself becomes a bit trickier to understand and analyze.
You will sometimes see heapsort done with a ternary heap. This has the advantage of being slightly faster on average, but if you find a heapsort implementation using this without knowing what you're looking at it can be fairly tricky to read. Other algorithms also use the same general structure but a more complex heap structure. Smoothsort uses a much more complicated heap to get O(n) best-case behavior while maintaining O(1) space usage and O(n log n) worst-case behavior. Poplar sort is similar to smoothsort, but with O(log n) space usage and slightly better performance. One can even think of classic sorting algorithms like insertion sort and selection sort as heap sort variants.
Once you have a better grasp of heapsort, you may want to look into the introsort algorithm, which combines quicksort, heapsort, and insertion sort to produce an extremely fast sorting algorithm with the strengths of quicksort (fast sorting on average), heapsort (excellent worst-case behavior), and insertion sort (fast sorting for small arrays). Introsort is what's used in many implementations of C++'s std::sort function, and is not very hard to implement yourself once you have a working heapsort. Hope this helps!
I'll see how I go in answering this, because my explanation for heap sort and what a heap is will be a little...
...uh, terrible.
Anyway, firstly, we'd better check what a Heap is:
As taken from Wikipedia, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). This implies that the element with the greatest key is always in the root node, giving a max-heap. (If the comparison is reversed instead, the smallest element is always in the root node, which results in a min-heap.)
Pretty much, a heap is a binary tree such that all the children of any node are smaller than that node.
Now, heap sort is an O(n lg(n)) sorting algorithm. You can read a little about it on Wikipedia (http://en.wikipedia.org/wiki/Heapsort). It pretty much works by putting all the elements of whatever you're trying to sort into a heap, then building the sorted array from the largest element to the smallest. You'll keep on restructuring the heap, and since the largest element is at the top (root) of the heap at all times, you can just keep taking that element and putting it at the back of the sorted array. (That is, you'll build the sorted array in reverse.)
Why is this algorithm O(n lg(n))? Because all the operations on a heap are O(lg(n)) and as a result, you'll do n operations, resulting in a total running time of O(n lg(n)).
I hope my terrible rant helped you! It's a little bit wordy; sorry about that...
Suppose you have a special data structure (called a heap) which allows you to store a list of numbers and lets you retrieve and remove the smallest one in O(lg n) time. Do you see how this leads to a very simple sorting algorithm?
The hard part (it's actually not that hard) is implementing the heap.
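If you want to see that "very simple sorting algorithm" in action before implementing the heap yourself, one way is to borrow Java's built-in PriorityQueue, which is a binary min-heap with O(lg n) offer and poll:

```java
import java.util.PriorityQueue;

// Sorting by repeatedly removing the smallest element from a heap.
// PriorityQueue is a binary min-heap, so poll() removes and returns
// the smallest remaining element.
public class HeapSortViaPQ {
    public static int[] sort(int[] input) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int x : input) {
            heap.offer(x);           // n inserts, O(lg n) each
        }
        int[] out = new int[input.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll();    // n removals, O(lg n) each
        }
        return out;
    }

    public static void main(String[] args) {
        int[] sorted = sort(new int[]{5, 3, 8, 1, 9, 2});
        System.out.println(java.util.Arrays.toString(sorted));
        // prints [1, 2, 3, 5, 8, 9]
    }
}
```

Note this version uses O(n) extra space for the queue; the in-place array tricks described in the answers above are what bring heap sort down to O(1) auxiliary storage.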
Maybe interactive tracing will help you understand the algorithm better. Here is a demo.
I remember my algorithm analysis professor telling us that the heap sort algorithm works like a heap of gravel:
Imagine you have a sack filled with gravel, and you empty that on the floor: Bigger stones will likely roll to the bottom and smaller stones (or sand) will stay on top.
You now take the very top of the heap and save it as the smallest value. Put the rest of the heap back in your sack, and repeat.
(Or you can get the opposite approach and grab the biggest stone you saw rolling on the floor, the example is still valid)
That's more or less the simplest way I know to explain how heap sort works.
Heap sort combines simple logic with O(n log n) time complexity and O(1) space complexity.