为什么这个快速排序有效？

发布于 2024-09-02 19:41:57 字数 2550 浏览 1 评论 0原文

我发现这种快速排序分区方法令人困惑且错误，但它似乎有效。我指的是这个伪代码。注意：他们在文章末尾也有一个 C 实现，但它与他们的伪代码有很大不同，所以我不关心它。

我也像这样用 C 编写了它，尝试尽可能忠实于伪代码，即使这意味着做一些奇怪的 C 东西：

#include <stdio.h>

int partition(int a[], int p, int r)
{
    int x = a[p];

    int i = p - 1;
    int j = r + 1;

    while (1)
    {
        do
            j = j - 1;
        while (!(a[j] <= x));

        do
             i = i + 1;
        while (!(a[i] >= x));

        if (i < j)
        {
            int t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
        else
        {
            for (i = 1; i <= a[0]; ++i)
                printf("%d ", a[i]);
            printf("- %d\n", j);

            return j;
        }
    }
}


int main()
{
    int a[100] = //{8, 6,10,13,15,8,3,2,12};
                {7, 7, 6, 2, 3, 8, 4, 1};

    partition(a, 1, a[0]);
    return 0;
}

如果你运行这个，你将得到以下输出：

1 6 2 3 4 8 7 - 5

但是，这是错误的，不是吗？显然，a[5] 之前的所有值都不低于它，因为 a[2] = 6 > > a[5] = 4。更不用说 7 应该是枢轴（最初的 a[p]），但它的位置既不正确又丢失。

以下分区算法取自 wikipedia：

int partition2(int a[], int p, int r)
{
    int x = a[r];
    int store = p;

    for (int i = p; i < r; ++i)
    {
        if (a[i] <= x)
        {
            int t = a[i];
            a[i] = a[store];
            a[store] = t;

            ++store;
        }
    }

    int t = a[r];
    a[r] = a[store];
    a[store] = t;

    for (int i = 1; i <= a[0]; ++i)
        printf("%d ", a[i]);
    printf("- %d\n", store);

    return store;
}

并产生以下输出：

1 6 2 3 8 4 7 - 1

我认为这是正确的结果：枢轴 (a[r] = a[7]) 已到达其最终位置。

但是，如果我在以下算法中使用初始分区函数：

void Quicksort(int a[], int p, int r)
{
    if (p < r)
    {
        int q = partition(a, p, r); // initial partitioning function

        Quicksort(a, p, q);
        Quicksort(a, q + 1, r); // I'm pretty sure q + r was a typo, it doesn't work with q + r.
    }
}

...这似乎是一个正确的排序算法。我在很多随机输入上测试了它，包括长度为 20 的所有 0-1 数组。

我还尝试使用此分区函数进行选择算法，但它未能产生正确的结果。它似乎有效，而且作为快速排序算法的一部分，它甚至非常快。

所以我的问题是：

任何人都可以发布一个该算法不起作用的示例吗？
如果没有，为什么它可以工作，因为分区部分似乎是错误的？这是另一种我不知道的分区方法吗？

原文

I find this Quicksort partitioning approach confusing and wrong, yet it seems to work. I am referring to this pseudocode. Note: they also have a C implementation at the end of the article, but it's very different from their pseudocode, so I don't care about that.

I have also written it in C like this, trying to stay true to the pseudocode as much as possible, even if that means doing some weird C stuff:

#include <stdio.h>

int partition(int a[], int p, int r)
{
    int x = a[p];

    int i = p - 1;
    int j = r + 1;

    while (1)
    {
        do
            j = j - 1;
        while (!(a[j] <= x));

        do
             i = i + 1;
        while (!(a[i] >= x));

        if (i < j)
        {
            int t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
        else
        {
            for (i = 1; i <= a[0]; ++i)
                printf("%d ", a[i]);
            printf("- %d\n", j);

            return j;
        }
    }
}


int main()
{
    int a[100] = //{8, 6,10,13,15,8,3,2,12};
                {7, 7, 6, 2, 3, 8, 4, 1};

    partition(a, 1, a[0]);
    return 0;
}

If you run this, you'll get the following output:

1 6 2 3 4 8 7 - 5

However, this is wrong, isn't it? Clearly a[5] does not have all the values before it lower than it, since a[2] = 6 > a[5] = 4. Not to mention that 7 is supposed to be the pivot (the initial a[p]) and yet its position is both incorrect and lost.

The following partition algorithm is taken from wikipedia:

int partition2(int a[], int p, int r)
{
    int x = a[r];
    int store = p;

    for (int i = p; i < r; ++i)
    {
        if (a[i] <= x)
        {
            int t = a[i];
            a[i] = a[store];
            a[store] = t;

            ++store;
        }
    }

    int t = a[r];
    a[r] = a[store];
    a[store] = t;

    for (int i = 1; i <= a[0]; ++i)
        printf("%d ", a[i]);
    printf("- %d\n", store);

    return store;
}

And produces this output:

1 6 2 3 8 4 7 - 1

Which is a correct result in my opinion: the pivot (a[r] = a[7]) has reached its final position.

However, if I use the initial partitioning function in the following algorithm:

void Quicksort(int a[], int p, int r)
{
    if (p < r)
    {
        int q = partition(a, p, r); // initial partitioning function

        Quicksort(a, p, q);
        Quicksort(a, q + 1, r); // I'm pretty sure q + r was a typo, it doesn't work with q + r.
    }
}

... it seems to be a correct sorting algorithm. I tested it out on a lot of random inputs, including all 0-1 arrays of length 20.

I have also tried using this partition function for a selection algorithm, in which it failed to produce correct results. It seems to work and it's even very fast as part of the quicksort algorithm however.

So my questions are:

Can anyone post an example on which the algorithm DOESN'T work?
If not, why does it work, since the partitioning part seems to be wrong? Is this another partitioning approach that I don't know about?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

感情洁癖 2024-09-09 19:41:57

我认为分区是正确的。 7是枢轴。原始数组被划分为一个长度为 5 的子数组（包含小于或等于 7 的元素）和一个长度为 2 的子数组（包含大于或等于 7 的元素）。

回复收藏 0 原文

初相遇 2024-09-09 19:41:57

从上面延伸这里应该是这样的

void swap(int *a, int *b)
{
    int x;

    x = *a;
    *a = *b;
    *b = x;
}

int partition(int s[], int l, int h) 
{ 
    int i;
    int p;/* pivot element index */ 
    int firsthigh;/* divider position for pivot element */ 

    p = h; 
    firsthigh = l; 
    for (i = l; i < h; i++) 
        if(s[i] < s[p]) { 
            swap(&s[i], &s[firsthigh]); 
            firsthigh++; 
        } 
    swap(&s[p], &s[firsthigh]); 

    return(firsthigh); 
}

void quicksort(int s[], int l, int h)
{
    int p;/* index of partition */ 
    if ((h - l) > 0) {
        p = partition(s, l, h); 
        quicksort(s, l, p - 1); 
        quicksort(s, p + 1, h); 
    } 
}

int main()     
{         
    int a[100] = //{8, 6,10,13,15,8,3,2,12};  
                   {7, 7, 6, 2, 3, 8, 4, 1};              
    quicksort(a, 0, 7);
    return 0; 
}

Extending on from above here is what it should look like

void swap(int *a, int *b)
{
    int x;

    x = *a;
    *a = *b;
    *b = x;
}

int partition(int s[], int l, int h) 
{ 
    int i;
    int p;/* pivot element index */ 
    int firsthigh;/* divider position for pivot element */ 

    p = h; 
    firsthigh = l; 
    for (i = l; i < h; i++) 
        if(s[i] < s[p]) { 
            swap(&s[i], &s[firsthigh]); 
            firsthigh++; 
        } 
    swap(&s[p], &s[firsthigh]); 

    return(firsthigh); 
}

void quicksort(int s[], int l, int h)
{
    int p;/* index of partition */ 
    if ((h - l) > 0) {
        p = partition(s, l, h); 
        quicksort(s, l, p - 1); 
        quicksort(s, p + 1, h); 
    } 
}

int main()     
{         
    int a[100] = //{8, 6,10,13,15,8,3,2,12};  
                   {7, 7, 6, 2, 3, 8, 4, 1};              
    quicksort(a, 0, 7);
    return 0; 
}

回复收藏 0 原文

韬韬不绝 2024-09-09 19:41:57

来自 Wikipedia （我已经强调了我认为直接解决您问题的部分）：

这是就地分区
算法。它划分了该部分
索引 left 和之间的数组的
正确的，包容性的，通过移动所有
元素小于或等于
array[pivotIndex] 到开头
子数组，留下所有更大的
跟随他们的元素。在
过程中它也找到了最终的
枢轴元素的位置，其中
它返回。它暂时移动
将元素旋转到末尾
子数组，这样它就不会进入
因为它只使用
交换，最终名单相同
元素作为原始列表。注意
一个元素可以被交换
在达到其值之前多次
最终名额。还应该注意的是
如果数据透视重复
输入数组，它们可以传播
穿过左子数组，可能在
随机顺序。这并不代表
分区失败，进一步
排序将重新定位，最后
将它们“粘”在一起。

这可能就是你所缺少的吗？

回复收藏 0 原文

老娘不死你永远是小三 2024-09-09 19:41:57

您对项目的索引和 iten 值感到困惑

看看您的标头

int partition(int a[], int p, int r) ;

现在，如果我们将数组 a 上的数据类型更改为某种奇怪的数据类型，您将看到问题

int partition( Otherdatatype a[], int p, int r) ;

您从 main 中调用该函数，请

partition(a, 1, a[0]);

参阅问题a[0]是a[0]中条目的值而不是索引值。

想象一下，代码中 a[0] 的值为 200，只需将第一项值更改为 200，您将收到运行时错误“尝试访问超出范围的内存”，因为如果您遵循
通过 a[0] = 200 作为值 r 传递到分区，然后跟踪分区内发生的情况。

要记住的是，这是分区标头中的排序例程，数组 a 中的列表可能与索引的类型不同。标头的 p 和 r 显然是引用索引位置的索引，而 a 是列表待排序。

因此，你的排序的主要起点是

partition(a, 0, items_in_array-1);

你明白为什么吗？数组 a 从 a[0] ... a[items_in_array-1] 运行，

因此在上面的示例中，您已将 8 个值预加载到数组中，因此来自 main 的分区调用应该是

partition(a, 0, 7);

You are getting confused between the index of the item and the iten value

Look at your header

int partition(int a[], int p, int r) ;

Now if we changed the data type on the array a to some weird data type you will see the problem

int partition( Otherdatatype a[], int p, int r) ;

You call the function from within your main with

partition(a, 1, a[0]);

See the problem a[0] is the value of the entry in a[0] not an index value.

Imagine a[0] had the value 200 in your code simply change the first item value to 200 and you will get a runtime error "attempt to access memory out of range" because if you follow
thru a[0] = 200 that is passed into partition as value r then follow what happens inside partition.

The thing to remember is this is a sort routine in your partition header the list in array a may not be of the same type as the indexes .. p and r of your header are clearly indexes referring to an index position and a is the list to be sorted.

Thus your main start to a sort is