当所有元素都相同时,快速排序的复杂性?
我有一个由 N 个相同数字组成的数组。我正在对其应用快速排序。 在这种情况下排序的时间复杂度应该是多少?
我绞尽脑汁地思考这个问题,但没有得到确切的解释。
任何帮助将不胜感激。
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
我有一个由 N 个相同数字组成的数组。我正在对其应用快速排序。 在这种情况下排序的时间复杂度应该是多少?
我绞尽脑汁地思考这个问题,但没有得到确切的解释。
任何帮助将不胜感激。
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
接受
或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
发布评论
评论(6)
这取决于快速排序的实现。传统的实现分为 2 个(
<
和>=
)部分,在相同的输入上将具有O(n*n)
。虽然不一定会发生交换,但它会导致n
次递归调用 - 每个递归调用都需要与枢轴和n-recursionDepth<进行比较/code> 元素。即需要进行
O(n*n)
比较,但是有一个简单的变体,它分为 3 个集合(
<
、=
和<代码>>)。在这种情况下,此变体具有O(n)
性能 - 无需选择主元,而是交换然后在0
上递归到pivotIndex-1
并pivotIndex+1
到n
,它将把所有等于枢轴的东西交换到“中间”分区(在所有相同输入的情况下总是意味着与自身交换即无操作)意味着在这种特殊情况下,调用堆栈的深度只有 1 n 次比较并且不会发生交换。我相信这个变体至少已经进入了 Linux 上的标准库。This depends on the implementation of Quicksort. The traditional implementation which partitions into 2 (
<
and>=
) sections will haveO(n*n)
on identical input. While no swaps will necessarily occur, it will causen
recursive calls to be made - each of which need to make a comparison with the pivot andn-recursionDepth
elements. i.e.O(n*n)
comparisons need to be madeHowever there is a simple variant which partitions into 3 sets (
<
,=
and>
). This variant hasO(n)
performance in this case - instead of choosing the pivot, swapping and then recursing on0
topivotIndex-1
andpivotIndex+1
ton
, it will put swap all things equal to the pivot to the 'middle' partition (which in the case of all identical inputs always means swapping with itself i.e. a no-op) meaning the call stack will only be 1 deep in this particular case n comparisons and no swaps occur. I believe this variant has made its way into the standard library on linux at least.快速排序的性能取决于主元的选择。所选的主元越接近中间元素,快速排序的性能就越好。
在这种特定情况下,您很幸运 - 您选择的主元将始终是中位数,因为所有值都是相同的。因此,快速排序的分区步骤永远不需要交换元素,并且两个指针将恰好在中间相遇。因此,这两个子问题的大小正好是一半 - 为您提供完美的
O(n log n)
。更具体地说,这取决于分区步骤的实施情况。循环不变式只需要确保较小的元素位于左侧子问题中,而较大的元素位于右侧子问题中。无法保证分区实现永远不会交换相等的元素。但这总是不必要的工作,因此没有聪明的实现应该做到这一点:
left
和right
指针永远不会检测到各自枢轴的反转(即,你永远不会遇到这种情况其中*left && *right) ,因此
left
指针将递增,right
指针将递增每一步递减,它们最终会在中间相遇,生成大小为 n/2 的子问题。The performance of quicksort depends on the pivot selection. The closer the chosen pivot is to the median element, the better is quicksort's performance.
In this specific case you're lucky - the pivot you select will always be a median, since all values are the same. The partition step of quicksort will hence never have to swap elements, and the two pointers will meet exactly in the middle. The two subproblems will have therefore be exactly half the size - giving you a perfect
O(n log n)
.To be a little more specific, this depends on how well the partition step is implemented. The loop-invariant only needs to make sure that smaller elements are in the left-hand sub-problem, while greater elements are in the right-hand sub-problem. There's no guarantee that a partition implementation never swaps equal elements. But it is always unnecessary work, so no clever implementation should do it: The
left
andright
pointers will never detect an inversion respective the pivot (i.e. you will never hit the case where*left > pivot && *right < pivot
) and so theleft
pointer will be incremented, theright
pointer will be decremented every step and they will finally meet in the middle, generating subproblems of sizen/2
.这取决于具体的实现。
如果只有一种比较(≤或<)来确定其他元素相对于主元的位置,它们将全部进入其中一个分区,并且您将得到 O(n 2)性能,因为每一步问题大小只会减少 1。
此处列出的算法是一个示例(附图适用于不同的算法)。
如果有两种比较,例如<对于左侧的元素和 >对于右侧的元素,就像两指针实现中的情况一样,并且如果您小心地逐步移动指针,那么您可能会得到完美的 O(n > log n) 性能,因为一半相等的元素将在两个分区中平均分割。
上面链接中的插图使用的算法不会按步骤移动指针,因此性能仍然很差(请查看“很少有唯一”的情况)。
所以这取决于你在实现算法时是否考虑到了这种特殊情况。
实际的实现通常处理更广泛的特殊情况:如果分区步骤中没有交换,他们假设数据几乎已排序,并使用插入排序,这给出了更好的 O(n)所有元素相等的情况。
It depends on the particular implementation.
If there is only one kind of comparison (≤ or <) to determine where the other elements go relative to the pivot, they will all go into one of the partitions, and you will get O(n2) performance, since the problem size will decrease by only 1 each step.
The algorithm listed here is an example (the accompanying illustration are for a different algorithm).
If there are two kinds of comparisons, for example < for elements on the left and > for elements on the right, as is the case in a two-pointer implementation, and if you take care to move the pointers in step, then you might get perfect O(n log n) performance, because half the equal elements will be split evenly in the two partitions.
The illustrations in the link above use an algorithm which doesn't move pointers in step, so you still get poor performance (look at the "Few unique" case).
So it depends whether you have this special case in mind when implementing the algorithm.
Practical implementations often handle a broader special case: if there are no swaps in the partitioning step, they assume the data is nearly sorted, and use an insertion sort, which gives an even better O(n) in the case of all equal elements.
tobyodavies 提供了正确的解决方案。当所有键都相等时,它确实处理这种情况并在 O(n) 时间内完成。
这与我们在荷兰国旗问题中所做的划分相同
http://en.wikipedia.org/wiki/ Dutch_national_flag_problem
分享 Princeton 的代码
http://algs4.cs.princeton .edu/23quicksort/Quick3way.java.html
tobyodavies has provided the right solution. It does handle the case and finish in O(n) time when all the keys are equal.
It is the same partitioning as we do in dutch national flag problem
http://en.wikipedia.org/wiki/Dutch_national_flag_problem
Sharing the code from princeton
http://algs4.cs.princeton.edu/23quicksort/Quick3way.java.html
如果您实现 2 路分区算法,那么每一步数组都会减半。这是因为当遇到相同的键时,扫描就会停止。因此,在每个步骤中,分区元素将位于子数组的中心,从而在每个后续递归调用中将数组减半。现在,这种情况类似于使用
~N lg N
比较对 N 个元素的数组进行排序的归并排序情况。因此,对于重复键,快速排序的传统 2 路分区算法使用 ~N lg N 进行比较,从而遵循线性算术方法。If you implement the 2-way partitioning algorithm then at every step the array will be halved. This is because when identical keys will be encountered, the scan stops. As a result at each step, the partitioning element will be positioned at the center of the subarray thereby halving the array in every subsequent recursive call. Now, this case is similar to the mergesort case which uses
~N lg N
compares to sort an array of N elements. Ergo for duplicate keys, the traditional 2-way partitioning algorithm for Quicksort uses~N lg N
compares, thereby following a linearithmic approach.快速排序代码是使用“分区”和“快速排序”函数完成的。
基本上,有两种实现快速排序的最佳方法。
这两者之间的区别仅在于“分区”功能,
1.Lomuto
2.Hoare
使用诸如上述 Lomuto 分区方案之类的分区算法(即使选择良好的主元值),快速排序对输入表现出较差的性能包含许多重复的元素。当所有输入元素相等时,问题就很明显:在每次递归时,左侧分区为空(没有输入值小于主元),而右侧分区仅减少了一个元素(主元被删除)。因此,Lomuto 分区方案需要二次时间来对相等值的数组进行排序。
因此,使用 Lomuto 分区算法需要 O(n^2) 时间。
通过使用霍尔分区算法,我们得到了所有数组元素相等的最佳情况。时间复杂度为O(n)。
参考:https://en.wikipedia.org/wiki/Quicksort
Quick Sort code is done using "partition" and "quicksort" functions.
Basically, there are two best ways for implementing Quicksort.
The difference between these two is only the "partition" function,
1.Lomuto
2.Hoare
With a partitioning algorithm such as the Lomuto partition scheme described above (even one that chooses good pivot values), quicksort exhibits poor performance for inputs that contain many repeated elements. The problem is clearly apparent when all the input elements are equal: at each recursion, the left partition is empty (no input values are less than the pivot), and the right partition has only decreased by one element (the pivot is removed). Consequently, the Lomuto partition scheme takes quadratic time to sort an array of equal values.
So, this takes O(n^2) time using the Lomuto partition algorithm.
By using the Hoare partition algorithm we get the best case with all the array elements equal. The time complexity is O(n).
Reference: https://en.wikipedia.org/wiki/Quicksort