对 CLRS 随机构建的二叉搜索树证明中的主张感到困惑
不确定我是否应该把它放在 math stackexchange 上,但是哦,好吧。
在 CLRS 第 300 页...
Theorem 12.4
The expected height of a randomly built binary search tree on n distinct keys is O(lgn).
他们定义了 3 个随机变量...
'Xn' is the height of a randomly built binary search on n keys.
'Yn' is the "exponential height", where Yn = 2^(Xn)
'Rn' is the position that the root key would occupy if the key's were sorted, its rank.
以及指标随机变量 Zn,1 Zn,2 Zn,3 ... Zn,n
...
'Zn,i = I{Rn = i}'
所以他们继续做出证明(见正文),但在证明过程中他们提出以下主张……
We claim that the indicator random variable Zn,i = I{Rn = i} is independent of the
values of Yi-1 and Yn-i. Having chosen Rn = i, the left subtree (whose exponential
height is Yi-1) is randomly built on the i-1 keys whose ranks are less than i. This
subtree is just like any other randomly built binary search tree on i-1 keys.
Other than the number of keys it contains, this subtree's structure is not affected
at all by the choice of Rn = i, and hence the random variables Yi-1 and Zn,i are
independent.
Yn-i 也是如此。我的问题是那部分,除了它包含的键的数量... 是的,子树的结构不受 Rn 影响,但事实上 Rn 影响 子树中键的数量似乎暗示了一种依赖关系,因为它限制了高度 的子树。
我显然错过了一些关键的关系。如有任何帮助,我们将不胜感激,谢谢。
Not sure if I should put this on math stackexchange instead, but oh well.
On page 300 of CLRS...
Theorem 12.4
The expected height of a randomly built binary search tree on n distinct keys is O(lgn).
They define 3 random variables...
'Xn' is the height of a randomly built binary search on n keys.
'Yn' is the "exponential height", where Yn = 2^(Xn)
'Rn' is the position that the root key would occupy if the key's were sorted, its rank.
And indicator random variables Zn,1 Zn,2 Zn,3 ... Zn,n
...
'Zn,i = I{Rn = i}'
So they go on to make the proof (see the text), but in the midst of it they make the following claim...
We claim that the indicator random variable Zn,i = I{Rn = i} is independent of the
values of Yi-1 and Yn-i. Having chosen Rn = i, the left subtree (whose exponential
height is Yi-1) is randomly built on the i-1 keys whose ranks are less than i. This
subtree is just like any other randomly built binary search tree on i-1 keys.
Other than the number of keys it contains, this subtree's structure is not affected
at all by the choice of Rn = i, and hence the random variables Yi-1 and Zn,i are
independent.
Likewise for Yn-i. My issue is that part, Other than the number of keys it contains...
Yes, the structures of the subtrees are unaffected by Rn, but the fact that Rn affects the
number of keys in the subtrees seems to imply a dependence due to how it limits the height
of the subtrees.
I'm obviously missing some key relationship. Any help is appreciated, thanks.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
对于预期时间证明,您可以将每次插入视为一个独立事件。插入器不是对抗性的(即不会尝试破坏您的二叉树)。现在,如果这确实是随机的,那么您可以考虑将要插入的每个其他(每个奇数或每个偶数)值作为坏节点插入。坏节点是导致树高度增加的节点。坏节点之间有好节点。
如果您已经有一棵高度为“h”的树(考虑它有 O(2^h) 个节点),那么它将有 O(2^(h-1)) 个节点作为叶子(大约一半的节点是叶子) 。新值去任何地方(即作为任何这些节点的子节点)的概率相同。预计其中一半将成为叶子的左子节点(将叶子节点的高度增加 1),另一半将成为该叶子的右子节点。这给出了树的预期 O(log n) 高度。因此,这样一棵树上的每个操作的成本都是 O(log n)。
For the expected time proof, you can think of every insertion to be an independent event. The inserter is not adversarial (i.e. doesn't try to break your binary tree). Now, if this is truly random, then you can consider every other (every odd or every even) value to be inserted as being inserted as a bad node. Bad nodes are ones that cause the tree to increase in height. Bad nodes have good nodes between them.
If you already have a tree of height 'h' (consider it has O(2^h) nodes), then it will have O(2^(h-1)) nodes as leaves (approximately half the total nodes are leaves). There is an equal probability of a new value going anywhere (i.e. as a child of any of those nodes). It is expected that half of them will become left child of a leaf (increasing the height of the leaf node by 1) and the other half will become the right child of that leaf. This gives an expected O(log n) height to the tree. Hence, every operation on such a tree costs O(log n).