MAX-HEAPIFY 中的最坏情况:“最坏情况发生在树的底层正好是半满时”
在MAX-HEAPIFY中给出
"the worst case occurs when the bottom level of the tree is exactly half full"
在CLRS,第三版,第155页中,我猜想 原因是在这种情况下,Max-Heapify 必须通过左子树“向下浮动”。
但我不明白的是“为什么是半满”?
如果左子树只有一颗叶子,Max-Heapify 也可以向下浮动。那么为什么不把这视为最坏的情况呢?
In CLRS, third Edition, on page 155, it is given that in MAX-HEAPIFY,
"the worst case occurs when the bottom level of the tree is exactly half full"
I guess the reason is that in this case, Max-Heapify has to "float down" through the left subtree.
But the thing I couldn't get is "why half full" ?
Max-Heapify can also float down if left subtree has only one leaf. So why not consider this as the worst case ?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
阅读整个上下文:
因为运行时间
T(n)
是通过中的元素数量来分析的树(n
),并且递归进入其中一个子树,我们需要找到子树中节点数量相对于n
的上限,这会产生T(n) = T(子树中的最大节点数) + O(1)
子树中节点数最坏的情况是最后一行在一侧尽可能满,另一方面尽可能空。这称为半满。左子树的大小将以
2n/3
为界。如果您提出的情况只有几个节点,那么这是无关紧要的,因为所有基本情况都可以被视为
O(1)
并被忽略。Read the entire context:
Since the running time
T(n)
is analysed by the number of elements in the tree (n
), and the recursion steps into one of the subtrees, we need to find an upper bound on the number of nodes in a subtree, relative ton
, and that will yield thatT(n) = T(max num. nodes in subtree) + O(1)
The worst case of number of nodes in a subtree is when the final row is as full as possible on one side, and as empty as possible on the other. This is called half full. And the left subtree size will be bounded by
2n/3
.If you're proposing a case with only a few nodes, then that's irrelevant, since all base cases can be considered
O(1)
and ignored.已经有一个公认的答案,但这个答案是为那些仍然有点困惑的人(就像我一样),或者仍然没有点击的东西。这是更长、更详细的解释。
虽然这听起来可能多余,但我们必须非常清楚确切的定义,因为通过我们对细节的关注……当你这样做时,证明事情可能会变得容易得多。
从 CLRS(第 6.1 节)来看,二叉堆数据结构是一个数组对象,可以将其视为几乎完整二叉树
来自 维基百科,在完全二叉树中,每个级别(可能除了最后一个级别)都是完全的填充,并且最后级别中的所有节点都尽可能左侧。
再次,来自 维基百科,平衡二叉树是二叉树结构,其中每个节点的左子树和右子树的高度相差不超过1。
现在我们已经武装好了,让我们开始吧。
所以,与根相比,左右子树最多可以相差1。
让我们考虑一棵树 T,令左子树的高度 = h+1,右子树的高度 = h
MAX_HEAPIFY 中最坏的情况会是什么?当我们在尝试维护堆属性的同时最终进行最大数量的比较和交换时,就会发生最坏的情况。
当 MAX_HEAPIFY 算法运行时,如果它递归地遍历最长路径,那么我们可以考虑可能的最坏情况,因为它将最终在最长路径中进行最大数量的比较和交换。
好吧,似乎所有最长的路径都恰好位于左子树中(因为它的高度为 h+1)。但有人可能会问:为什么不是正确的子树呢?请记住上面的定义,最后级别中的所有节点都必须尽可能左侧。
现在,因为我们必须涵盖可能导致最坏情况的所有可能性,所以我们需要获得更多数量的较长路径(如果存在),因此,我们应该将 left 子-tree FULL (但是为什么?这样我们就可以获得更多路径可供选择,并选择给出最坏情况时间的路径)。
由于左子树的高度为 h+1,因此它将有 2^(h+1) 个数。叶节点,因此,从根开始的路径数为 2^(h+1)。这是高度为 h+1 的树 T 中可能路径的最大数量。
注意:如果您仍在阅读,请继续阅读,也许只是为了清晰起见。
这是最坏情况下树结构的图像。
在上图中,如您所见,左子树(黄色)和右子树(粉红色)各有 x 个节点。粉色部分是完整的右子树,黄色部分是不包括最后一层的左子树。
请注意,左侧(黄色)和右侧(粉色)子树的高度均为 h。
现在,从一开始,我们就认为左子树整体的高度为h+1(即包括黄色部分和最后一层)。
现在,如果我可以问,我们必须在最后一层(即黄色部分下方)添加多少个节点才能使左子树完全充满?
好吧,黄色部分的最底层有 ⌈x/2⌉ 个节点(即具有 n 个节点的树/子树中的叶子总数 = ⌈n/2⌉;要进行证明,请访问 此链接),现在如果我们添加这些节点或叶子中的每一个都有 2 个子节点 =>总共添加了 x (≈x) 个节点(如何?⌈x/2⌉ 叶子 * 2 ≈ x 个节点)。
通过这一添加,我们使高度为 h+1 的左子树(即高度为 h 的黄色部分和添加的最后一层)为 FULL,从而满足最坏情况的标准。
由于左子树是FULL,所以整棵树是HALF FULL。
现在有人可能会问:如果我们添加更多节点怎么办,或者具体来说,如果我们在右子树中添加节点怎么办?嗯,我们不。这是因为现在如果我们碰巧添加更多节点,这些节点将被添加到右子树中(因为左子树已满),这反过来又会更加平衡树。现在,随着树开始变得更加平衡,我们倾向于朝着最好的情况而不是最坏的情况发展。
最后一个问题:我们总共有多少个节点?
树中的节点总数,n = x(来自黄色部分)+ x(来自粉红色部分)+ x(添加黄色部分下面的最后一层)= 3x
你能注意到什么吗?作为副产品,左子树总共包含最多 2x 个节点,即 2n/3 节点 (bcoz x = n/3)。
Already there's an accepted answer but this answer is for those people who are still a bit confused (as I was), or something still doesn't click. So here's a little bit longer and more detailed explanation.
Though it might sound redundant, we have to be very clear about the exact definitions because through our attention to the details... chances are when you do that proving things becomes much easier.
From CLRS (section 6.1), a Binary Heap data structure is an array object that can be viewed as a nearly complete binary tree
From Wikipedia, In a complete binary tree, every level (except possibly the last level) is completely filled, and all the nodes in the last level are as far left as possible.
Again, from Wikipedia, A balanced binary tree is a binary tree structure in which the left and right sub-trees of every node differ in height by no more than 1.
Now that we are armed, let's dive in.
So, in comparison to the root, the height of the left and right sub-tree can differ by 1 at most.
Let's consider a tree T and let the height of the left sub-tree = h+1 and the height of the right sub-tree = h
What can be the worst-case in MAX_HEAPIFY? The worst-case happens when we end up doing maximum number of comparisons and swaps while trying to maintain the heap property.
When the MAX_HEAPIFY algorithm runs and if it recursively goes through the longest path then we can consider a possible worst-case because it will end up doing the maximum number of comparisons and swaps in the longest path.
Well, it seems all of the longest paths happen to be in the left sub-tree (as its height is h+1). But someone might as well ask: Why not the right sub-tree? Remember the above definition, all the nodes in the last level have to be as far left as possible.
Now because we have to cover every possibility that can lead to a worst-case, we need to get more number of longer paths, if any exist, and because of that, we ought to make the left sub-tree FULL (But Why? So that we can get more paths to choose from and opt for the path that gives the worst-case time among all).
Since the left subtree has a height h+1, it will have 2^(h+1) no. of leaf nodes, and, therefore, 2^(h+1) number of paths from the root. This is the maximum number of possible paths in a tree T of h+1 height.
Note: Please hold on to it if you are still reading, maybe just for the sake of crystal clarity.
Here's the image of the tree structure in the worst-case situation.
In the above image, as you can see, consider that the left (in yellow) sub-tree and the right (in pink) sub-tree each has x nodes. The pink portion is a complete right sub-tree and the yellow portion is the left sub-tree excluding the last level.
Notice that both the left (yellow) and the right (pink) sub-trees have a height of h.
Now, from the start, we have considered the left subtree to be of height h+1 as a whole (i.e. including the yellow portion and the last level).
Now, if I may ask, how many nodes do we have to add in the last level i.e. below the yellow portion to make the left sub-tree completely FULL?
Well, the bottom-most layer of the yellow portion has ⌈x/2⌉ nodes (i.e. Total number of leaves in a tree/subtree having n nodes = ⌈n/2⌉; for proof visit this link), and now if we add 2 children to each of these nodes or leaves => total x (≈x) nodes have been added (How? ⌈x/2⌉ leaves * 2 ≈ x nodes).
With this addition, we make the left sub-tree of height h+1 (i.e. the yellow portion with height h and the one last level added) FULL, hence meeting the worst-case criteria.
Since the left sub-tree is FULL, the whole Tree is HALF FULL.
Now someone might as well ask: What if we add more nodes, or, specifically, what if we add nodes in the right sub-tree? Well, we don't. And that's because now if we happen to add more nodes, the nodes will be added in the right sub-tree (as the left sub-tree is FULL), which, in turn, will tend to balance out the tree more. Now as the tree is starting to get more balanced, we are tending to move towards the best-case scenario and not the worst-case.
Final question : How many nodes do we have in total?
Total nodes in the tree, n = x (from the yellow portion) + x (from the pink portion) + x (addition of the last level below the yellow portion) = 3x
Can you notice something? As a by-product, the left sub-tree in total contains at most 2x nodes i.e. 2n/3 nodes (bcoz x = n/3).