Algorithm analysis (complexity)
How are algorithms analyzed? What makes quicksort have an O(n^2)
worst-case performance while merge sort has an O(n log(n))
worst-case performance?
That's a topic for an entire semester. Ultimately we are talking about the upper bound on the number of operations that must be completed before the algorithm finishes, as a function of the size of the input. We do not include the coefficients (e.g. the 10 in 10N or the 4 in 4N^2) because for N large enough they no longer matter.
Proving the big-O of an algorithm can be quite difficult. It requires a formal proof and there are many techniques. Often a good ad hoc method is simply to count how many passes over the data the algorithm makes. For instance, if your algorithm has nested for loops, then for each of N items you must operate N times. That would generally be O(N^2).
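To make that counting method concrete, here is a minimal Python sketch (added for illustration, not taken from the answer): a nested loop whose inner body runs once per pair of items, giving N^2 counted operations for N items.

    def count_nested_loop_ops(items):
        """Count how often the body of a nested loop runs."""
        ops = 0
        for a in items:        # outer loop: one pass per item, N passes total
            for b in items:    # inner loop: N operations per pass
                ops += 1       # stand-in for the real work, e.g. comparing a and b
        return ops

    print(count_nested_loop_ops(range(10)))  # prints 100, i.e. N^2 for N = 10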
As to merge sort, you split the data in half over and over. That takes log2(n) splits. For each level of splitting you make a pass over the data, which gives N log(n).
Quicksort is a bit trickier because in the average case it is also n log(n). You have to imagine what happens if the partition splits the data such that every time you get only one element on one side of the partition. Then you need to split the data n times instead of log(n) times, which makes it N^2. The advantage of quicksort is that it can be done in place, and that we usually get closer to N log(n) performance.
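A small Python sketch of that step-count contrast (an added illustration, not the answerer's code): halving the data until one element remains takes about log2(n) levels, while peeling off one element per split takes about n levels.

    import math

    def levels_when_halving(n):
        """Depth of recursion when the data is halved each time (merge sort)."""
        levels = 0
        while n > 1:
            n //= 2
            levels += 1
        return levels

    def levels_when_peeling_one(n):
        """Depth when each split leaves one element on one side (quicksort's worst case)."""
        levels = 0
        while n > 1:
            n -= 1
            levels += 1
        return levels

    n = 1024
    print(levels_when_halving(n), math.log2(n))  # 10 levels, matching log2(1024) = 10.0
    print(levels_when_peeling_one(n))            # 1023 levels, proportional to n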
This is introductory analysis-of-algorithms course material.
An operation is defined (e.g. multiplication) and the analysis is performed in terms of either space or time: the chosen operation is counted. Typically the analysis treats time as the dependent variable and input size as the independent variable.
Example pseudocode:
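(The pseudocode itself did not survive in this copy of the answer; the Python stand-in below is an assumption reconstructed from the description that follows, performing and counting one operation per element of the list, which the answer refers to as @list.)

    def count_ops(lst):
        """Perform and count one operation per element of lst."""
        ops = 0
        for item in lst:  # a single pass over the data
            ops += 1      # the counted operation, e.g. one multiplication
        return ops

    print(count_ops([3, 1, 4, 1, 5]))  # 5 operations for a list of size n = 5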
There will be n operations performed, where n is the size of @list. Count it yourself if you don't believe me. Analyzing quicksort and merge sort requires a decent level of what is known as mathematical sophistication. Loosely, you solve a recurrence relation (the discrete analogue of a differential equation) derived from the recursive structure of the algorithm.
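For reference, the recurrences in question take roughly these standard textbook forms (added here for illustration, not quoted from the answer):

    merge sort:             T(n) = 2*T(n/2) + c*n   which solves to  T(n) = O(n log n)
    quicksort, worst case:  T(n) = T(n-1) + c*n     which solves to  T(n) = O(n^2)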
Both quicksort and merge sort split the array into two, sort each part recursively, then combine the result. Quicksort splits by choosing a "pivot" element and partitioning the array into elements smaller or greater than the pivot. Merge sort splits arbitrarily and then merges the results in linear time. In both cases a single step is O(n), and if the array size halves each time this gives a logarithmic number of steps. So we would expect O(n log(n)).
However, quicksort has a worst case where the split is always uneven, so you don't get a number of steps proportional to the logarithm of n, but a number of steps proportional to n. Merge sort splits exactly into two halves (or as close as possible), so it doesn't have this problem.
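To make the "single step is O(n)" part concrete, here is a minimal Python sketch (added for illustration; the function names are mine, not from the answer) of the two linear-time steps: quicksort's partition and merge sort's merge. Each performs a linear scan of its input.

    def partition(arr, pivot):
        """Quicksort's step: split arr into elements smaller than, equal to,
        and greater than the pivot. Linear scans of arr, so O(n)."""
        smaller = [x for x in arr if x < pivot]
        equal   = [x for x in arr if x == pivot]
        greater = [x for x in arr if x > pivot]
        return smaller, equal, greater

    def merge(left, right):
        """Merge sort's step: merge two sorted lists into one sorted list in O(n)."""
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    print(partition([3, 7, 1, 7, 9, 2], 7))  # ([3, 1, 2], [7, 7], [9])
    print(merge([1, 4, 6], [2, 3, 5]))       # [1, 2, 3, 4, 5, 6]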
If the input array is already sorted, then quicksort (picking the first item as the pivot) becomes just a kind of selection sort!
Because you are not really dividing the array: you are only picking off the first item in each pass.
On the other hand, merge sort will always divide the input array in the same manner, regardless of its content!
Also note: divide and conquer performs best when the divisions are of nearly equal length!
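A rough Python sketch of that degenerate case (added for illustration, assuming the first element is used as the pivot, as the answer implies): on a sorted input every partition leaves the pivot alone on one side, so the recursion depth grows in proportion to n rather than log(n).

    def quicksort_first_pivot(arr, depth=0):
        """Quicksort using the first element as pivot; returns (sorted list, max recursion depth)."""
        if len(arr) <= 1:
            return arr, depth
        pivot = arr[0]
        left  = [x for x in arr[1:] if x < pivot]
        right = [x for x in arr[1:] if x >= pivot]
        left_sorted,  d1 = quicksort_first_pivot(left,  depth + 1)
        right_sorted, d2 = quicksort_first_pivot(right, depth + 1)
        return left_sorted + [pivot] + right_sorted, max(d1, d2)

    print(quicksort_first_pivot([3, 1, 4, 1, 5, 9, 2, 6])[1])  # shallow recursion on shuffled input
    print(quicksort_first_pivot(list(range(30)))[1])           # depth 29: one level per element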
Analysing algorithms is a painstaking effort, and it is error-prone. I would compare it with a question like: what is the chance of being dealt two aces in a bridge game? One has to carefully consider all possibilities and must not overlook that the aces can arrive in any order.
So what one does to analyse these algorithms is go through actual pseudocode of the algorithm and add up what a worst-case situation would cost. In the following I will paint with a broad brush.
For quicksort one has to choose a pivot to split the set. With dramatically bad luck the set splits into a set of n-1 and a set of 1 every time, for n steps, where each step means inspecting n elements. This arrives at N^2.
For merge sort one starts by splitting the sequence into runs that are already in order. Even in the worst case that means at most n runs. Those can be combined two by two, then the larger sets are combined two by two, and so on. However, those (at most) n/2 first combinations deal with extremely small subsets, while the last step deals with subsets of about size n, but there is just one such step. This arrives at N log(N).
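A rough Python sketch of that pairwise combining (an added illustration, not the answerer's code): starting from single-element runs, each pass merges runs two by two and touches about n elements in total, and there are about log2(n) passes.

    def merge(left, right):
        """Merge two sorted runs in linear time."""
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    def bottom_up_mergesort(arr):
        """Combine runs two by two; also count how many passes over the data are made."""
        runs = [[x] for x in arr]  # worst case: n runs of length 1
        passes = 0
        while len(runs) > 1:
            runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                    for i in range(0, len(runs), 2)]
            passes += 1            # each pass touches roughly n elements in total
        return (runs[0] if runs else []), passes

    print(bottom_up_mergesort([5, 3, 8, 1, 9, 2, 7, 4]))  # ([1, 2, 3, 4, 5, 7, 8, 9], 3); 3 = log2(8) passes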