Quicksort partition
I am trying to understand the quicksort algorithm from this web site. Paul's implementation is as fast as std::sort (quicksort for large ranges, insertion sort for small ranges).
I compared Paul's implementation with mine; mine is 3 times slower than his. By profiling our code, I found that the main difference is the partition function.
The following is an excerpt from Paul's code:
void Partition(int data[], int low, int high){
    int pivot = data[low];   // cache the pivot; data[low] becomes the "hole"
    int fromLow = low;
    int fromHigh = high;
    int ptr = low;           // current position of the hole
    goto QStart;             // enter mid-loop: examine data[high] before any decrement
    while(true){
        do {                 // scan right-to-left for a value <= pivot
            fromHigh--;
            if(fromLow >= fromHigh)
                goto QEnd;
QStart:     ;
        } while(data[fromHigh] > pivot);
        data[ptr] = data[fromHigh];   // move it into the hole
        ptr = fromHigh;               // the hole is now at fromHigh
        do {                 // scan left-to-right for a value >= pivot
            fromLow++;
            if(fromLow >= fromHigh){
                ptr = fromHigh;
                goto QEnd;
            }
        } while(data[fromLow] < pivot);
        data[ptr] = data[fromLow];
        ptr = fromLow;
    }
QEnd:
    data[ptr] = pivot;       // drop the pivot into the final hole
}
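For what it's worth, the goto QStart only skips the first fromHigh-- and its bounds check, so that data[high] itself is examined on the first pass. Here is a goto-free rendering (my rewrite, not from the site, so treat it as a sketch; the behavior is intended to be identical):

void PartitionNoGoto(int data[], int low, int high){
    int pivot = data[low];
    int fromLow = low;
    int fromHigh = high + 1;   // the first fromHigh-- brings this back to high,
    int ptr = low;             // which is what goto QStart accomplishes above
    while(true){
        do {                   // scan right-to-left for a value <= pivot
            fromHigh--;
            if(fromLow >= fromHigh){ data[ptr] = pivot; return; }
        } while(data[fromHigh] > pivot);
        data[ptr] = data[fromHigh];
        ptr = fromHigh;
        do {                   // scan left-to-right for a value >= pivot
            fromLow++;
            if(fromLow >= fromHigh){ data[fromHigh] = pivot; return; }
        } while(data[fromLow] < pivot);
        data[ptr] = data[fromLow];
        ptr = fromLow;
    }
}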
And the following is mine:
void MyPartition(int data[], int low, int high){
    int pivot = data[low];   // cache the pivot; data[low] becomes the "hole"
    int fromLow = low;
    int fromHigh = high;
    int ptr = low;           // current position of the hole
    while(fromLow != fromHigh){
        if(data[fromHigh] >= pivot)   // scan right-to-left for a value < pivot
            fromHigh--;
        else{
            data[ptr] = data[fromHigh];   // move it into the hole
            ptr = fromHigh;
            while(data[fromLow] <= pivot && fromLow != fromHigh)
                fromLow++;                // scan left-to-right for a value > pivot
            data[ptr] = data[fromLow];
            ptr = fromLow;
        }
    }
    data[ptr] = pivot;       // drop the pivot into the final hole
}
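For reference, a minimal timing harness of the kind I used to compare them (the exact setup is assumed, not reproduced): it runs both partition functions over identical pre-generated random inputs.

#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

void Partition(int data[], int low, int high);    // Paul's, above
void MyPartition(int data[], int low, int high);  // mine, above

// Time one partition function over identical copies of the same inputs.
double TimePartition(void (*f)(int[], int, int),
                     const std::vector<std::vector<int>>& inputs){
    auto work = inputs;   // fresh copies, so both contenders see the same data
    auto t0 = std::chrono::steady_clock::now();
    for(auto& v : work)
        f(v.data(), 0, (int)v.size() - 1);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main(){
    std::mt19937 rng(1234);
    std::vector<std::vector<int>> inputs(2000, std::vector<int>(10000));
    for(auto& v : inputs)
        for(auto& x : v)
            x = (int)(rng() % 100000);
    printf("Paul's: %.3fs\n", TimePartition(Partition, inputs));
    printf("mine:   %.3fs\n", TimePartition(MyPartition, inputs));
}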
These two functions implement the same algorithm, and I believe they have the same big-O complexity:
- First, scan the array from the high end to the low end (right => left) to find the first value that is less than the pivot.
- Then, scan the array from the low end to the high end (left => right) to find the first value that is greater than the pivot.
- In either scan, if such a value is found, we "swap" it with the pivot value. This is a logical swap: because the pivot value is cached in the variable pivot, we can regard the variable ptr as the current position of the pivot value. (The small driver below illustrates this on a concrete array.)
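To make the logical swap concrete, here is a small driver (my own, for illustration) that partitions a sample array around the pivot 5; compile it together with MyPartition above:

#include <cstdio>

void MyPartition(int data[], int low, int high);  // defined above

int main(){
    int a[] = {5, 8, 1, 9, 3, 7, 2};   // pivot will be a[0] == 5
    MyPartition(a, 0, 6);
    for(int v : a)
        printf("%d ", v);   // prints "2 3 1 5 9 7 8": 5 is in its final place,
    printf("\n");           // smaller keys on its left, larger keys on its right
}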
Does anyone know why Paul's implementation is faster than mine?
UPDATE:
int PartitionData2(int *data, int low, int high){
    int pivot = data[low];
    int fromLow = low;
    int fromHigh = high;
    int ptr = low;
    while(fromLow < fromHigh){
        if(data[fromHigh] > pivot)    // '>=' ==> '>'
            fromHigh--;
        else{
            data[ptr] = data[fromHigh];
            ptr = fromHigh;
            while(data[++fromLow] < pivot &&   // '<=' ==> '<'
                  fromLow != fromHigh);
            data[ptr] = data[fromLow];
            ptr = fromLow;
            fromHigh--;                // antti.huima's idea
        }
    }
    data[ptr] = pivot;
    return ptr;
}
I just updated the code according to antti.huima's idea, and it makes my code as fast as Paul's.
It confuses me, though, because it now looks like we swap values that are equal to the pivot.
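Since PartitionData2 now returns the pivot's final index, wiring it into a full recursive sort is straightforward. A hedged sketch (QuickSort2 is my own name, not from Paul's code):

#include <cstdio>

int PartitionData2(int *data, int low, int high);  // updated version above

void QuickSort2(int *data, int low, int high){
    if(low >= high)
        return;                               // ranges of size 0 or 1 are sorted
    int p = PartitionData2(data, low, high);  // pivot lands at its final index p
    QuickSort2(data, low, p - 1);             // sort keys left of the pivot
    QuickSort2(data, p + 1, high);            // sort keys right of the pivot
}

int main(){
    int a[] = {9, 4, 7, 4, 1, 8, 2, 4, 6};
    QuickSort2(a, 0, 8);
    for(int v : a)
        printf("%d ", v);   // 1 2 4 4 4 6 7 8 9
    printf("\n");
}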
UPDATE2:
I understand now why we need to move elements that are equal to the pivot: if we don't, the two new partitions will be uneven, e.g. one can be much larger than the other, and with many equal keys it eventually degenerates to the O(n^2) case.
See this PDF.
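To see the effect, here is a small experiment (my own, assuming the functions above are in the same file). On an array of all-equal keys, the strict comparisons keep trading equal elements between the two scans, so the pivot lands near the middle:

#include <cstdio>

int PartitionData2(int *data, int low, int high);  // updated version above

int main(){
    const int N = 1001;
    static int a[N];
    for(int i = 0; i < N; i++)
        a[i] = 7;                       // every key equals the pivot
    int p = PartitionData2(a, 0, N - 1);
    printf("pivot index: %d of %d\n", p, N - 1);   // ~N/2: an even split
    // With the original '>=' / '<=' tests, fromHigh would walk all the way
    // down to low without moving anything, the pivot would stay at index 0,
    // and each recursion level would peel off one element: the O(n^2) case.
}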
Comments (1)
You have some redundant checks in your code that Paul's code doesn't have.
For example, on the line

    while(data[fromLow] <= pivot && fromLow != fromHigh)

the first check is redundant on the first iteration, because whenever you start this loop you have just swapped in, from 'fromHigh', a value that is less than the pivot, so data[fromLow] is known not to be higher than the pivot. Because for a randomly ordered array this loop only runs for a couple of iterations (it terminates with 50% probability for a random pivot), in practice you are doing about 25% extra comparisons compared to Paul's code, which doesn't do the pivot comparison before the limit check but first increments fromLow once.
You have the same performance bug in the first loop, where you decrease fromHigh, even though it's syntactically different. Paul's code doesn't have it... and that's why he needs the goto to QStart :)
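To put a number on the redundant comparison, here is a toy instrumentation (mine, illustrative only) of the two inner-loop shapes; bounds checks are omitted because the sample run stops before the end of the array:

#include <cstdio>

int main(){
    int pivot = 5;
    int data[] = {1, 2, 9, 9};   // data[0] was just swapped in, so it is
    long cmps;                   // already known to be < pivot

    // The asker's shape: compare before the first step, so the element
    // already known to be small gets tested again.
    cmps = 0;
    int i = 0;
    while(cmps++, data[i] <= pivot)
        i++;
    printf("pre-test shape:  stop at %d after %ld comparisons\n", i, cmps);

    // Paul's shape: step first, then compare; the known-small element
    // is never re-tested.
    cmps = 0;
    int j = 0;
    do {
        j++;
        cmps++;
    } while(data[j] < pivot);
    printf("post-step shape: stop at %d after %ld comparisons\n", j, cmps);
    // Both stop at index 2, but the pre-test shape used 3 comparisons
    // where the post-step shape used 2.
}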