Prove that the running time of optimized mergesort is Θ(NK + N log(N/K))?

Posted on 2024-10-14 23:03:57

Okay, I know mergesort has a worst-case time of Θ(N log N), but its overhead is high and manifests near the bottom of the recursion tree, where the merges are made. Someone proposed that we stop the recursion once the subproblem size reaches K and switch to insertion sort at that point. I need to prove that the running time of this modified algorithm is Θ(NK + N log(N/K)). I am blanking on how to approach this problem.


Comments (1)

柠檬色的秋千 2024-10-21 23:03:57

Maybe a good start is to look at the recurrence relation for this problem. I imagine for typical mergesort it would look something like this:

T(N) = 2 * T(N / 2) + N

i.e. you are dividing the problem into 2 subproblems of half the size, and then performing N work (the merge). We have a base case that takes constant time.
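For concreteness, here is a minimal sketch of the mergesort this recurrence describes (my own illustration, not code from the question): two recursive calls on halves, plus a linear-time merge, i.e. T(N) = 2*T(N/2) + N.

```python
def merge(left, right):
    # Linear-time merge of two sorted lists: the "+N" term.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def merge_sort(a):
    if len(a) <= 1:                # base case: constant time
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])     # T(N/2)
    right = merge_sort(a[mid:])    # T(N/2)
    return merge(left, right)      # N work
```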

Modelling this as a tree we have:

T(N)   =   N   ->   T(N / 2)   
               ->   T(N / 2)

       =   N   ->   (N / 2)   ->   T(N / 4)
                              ->   T(N / 4)
               ->   (N / 2)   ->   T(N / 4)
                              ->   T(N / 4)

This gives an expansion of

T(N) = N + 2N/2 + 4N/4 + ...
     = N + N + N ...

So really we just need to see how deep it goes. We know that the ith level operates on subproblems N / 2^i in size. So our leaf nodes (T(1)) occur on level L where N / 2^L = 1:

N / 2^L = 1
N = 2^L
log N = log(2^L)
log N = L

So our runtime is N log N.
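You can sanity-check the depth derivation numerically (a throwaway helper of my own, assuming N is a power of two): halving until the subproblem size hits 1 takes exactly log2(N) levels.

```python
def depth_to_base(n, base=1):
    # Count how many halvings it takes to shrink n down to `base`,
    # i.e. the number of levels in the recursion tree above the leaves.
    levels = 0
    while n > base:
        n //= 2
        levels += 1
    return levels
```

For example, `depth_to_base(1024)` gives 10, matching log2(1024) = 10.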

Now we introduce insertion sort. Our tree will look something like this

T(N) = ... -> I(K)
           -> I(K)
             ...x N/K

In other words, we will at some level L have to solve N/K insertion sort problems of size K. Insertion sort has a worst-case runtime of K^2. So at the leaves we have this much work in total:

(N / K) * I(K)
= (N / K) * K * K
= N * K
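The I(K) leaves are plain insertion sort, which in the worst case does on the order of K^2 comparisons and shifts on a run of length K, but with very low constant factors, which is exactly why it wins on small inputs. A standard version, for reference:

```python
def insertion_sort(a):
    # In-place insertion sort: worst case ~K^2 shifts on a length-K list,
    # but tiny constants and great behavior on nearly-sorted data.
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```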

But we also have a bunch of merging to do as well, at a cost of N per level of the tree as explained before. Going back to our previous method, let's find L (the number of levels before we reach subproblems of size K and thus switch to insertion):

N / 2^L = K
N / K = 2^L
L = log (N/K)

So in total we have

T(N) = Θ(N * K + N * log(N/K))
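Putting the two pieces together, the hybrid might be sketched like this (names such as `hybrid_sort` and the cutoff default `K=16` are my own choices, not from the question): recurse as in mergesort, but once a subproblem's size drops to K or below, finish it with insertion sort. The N/K leaves cost (N/K) * O(K^2) = O(NK) in total, and each of the log(N/K) merge levels above them costs O(N).

```python
def insertion_sort(a):
    # Leaf routine: O(K^2) worst case on a subproblem of size <= K.
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def hybrid_sort(a, K=16):
    if len(a) <= K:                    # cutoff: switch to insertion sort
        return insertion_sort(a)
    mid = len(a) // 2
    left = hybrid_sort(a[:mid], K)
    right = hybrid_sort(a[mid:], K)
    # standard linear-time merge, N work per level
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```

Note the recursion tree is simply truncated log(N/K) levels above where plain mergesort's would bottom out, which is where the N log(N/K) term comes from.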

It's been too long since I took algorithms to give you a proof sketch, but that should get your neurons firing.
