为什么 C 快速排序函数(磁带比较、磁带交换)比冒泡排序函数慢得多?

发布于 2024-10-16 07:36:59 字数 4746 浏览 11 评论 0原文

我将为学生实现一个玩具磁带“大型机”,展示“快速排序”类函数的速度(递归与否,并不重要,因为硬件速度较慢,以及众所周知的堆栈反转技术) “冒泡排序”函数类。所以,虽然我很清楚硬件实现和控制器,但我猜测快速排序功能在序列、顺序和比较距离方面比其他功能快得多(从中间倒带比从最开始倒带要快得多)结束,因为倒带速度不同)。

不幸的是,事实并非如此。与“快速排序”函数相比,这个简单的“冒泡”代码在比较距离、方向以及比较和写入次数方面显示出巨大的改进。

所以我有3个问题:

  1. 我在实现快速排序功能时是否有错误?
  2. 我在实现 bubblesoft 功能时是否有错误?
  3. 如果不是,为什么“冒泡排序”函数(比较和写入操作)比“快速排序”函数快得多?

我已经有一个“快速排序”函数:

void quicksort(float *a, long l, long r, const compare_function& compare)
{
    long i=l, j=r, temp, m=(l+r)/2;
    if (l == r) return;
    if (l == r-1)
    {
        if (compare(a, l, r))
        {
            swap(a, l, r);
        }
        return;
    }
    if (l < r-1)
    {
        while (1)
        {
            i = l;
            j = r;
            while (i < m && !compare(a, i, m)) i++;
            while (m < j && !compare(a, m, j)) j--;
            if (i >= j)
            {
                break;
            }
            swap(a, i, j);
        }
        if (l < m) quicksort(a, l, m, compare);
        if (m < r) quicksort(a, m, r, compare);
        return;
    }
}

并且我有自己的“冒泡排序”函数实现:

void bubblesort(float *a, long l, long r, const compare_function& compare)
{
    long i, j, k;
    if (l == r)
    {
        return;
    }
    if (l == r-1)
    {
        if (compare(a, l, r))
        {
            swap(a, l, r);
        }
        return;
    }
    if (l < r-1)
    {
        while(l < r)
        {
            i = l;
            j = l;
            while (i < r)
            {
                i++;
                if (!compare(a, j, i))
                {
                    continue;
                }
                j = i;
            }
            if (l < j)
            {
                swap(a, l, j);
            }
            l++;
            i = r;
            k = r;
            while(l < i)
            {
                i--;
                if (!compare(a, i, k))
                {
                    continue;
                }
                k = i;
            }
            if (k < r)
            {
                swap(a, k, r);
            }
            r--;
        }
        return;
    }
}

我在测试示例代码中使用了这些排序函数,如下所示:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <conio.h>

long swap_count;
long compare_count;

typedef long (*compare_function)(float *, long, long );
typedef void (*sort_function)(float *, long , long , const compare_function& );

void init(float *, long );
void print(float *, long );

void sort(float *, long, const sort_function& );
void swap(float *a, long l, long r);

long less(float *a, long l, long r);
long greater(float *a, long l, long r);

void bubblesort(float *, long , long , const compare_function& );
void quicksort(float *, long , long , const compare_function& );

void main()
{
    int n;
    printf("n=");

    scanf("%d",&n);
    printf("\r\n");

    long i;
    float *a = (float *)malloc(n*n*sizeof(float));

    sort(a, n, &bubblesort);
    print(a, n);

    sort(a, n, &quicksort);
    print(a, n);

    free(a);
}

long less(float *a, long l, long r)
{
    compare_count++;
    return *(a+l) < *(a+r) ? 1 : 0;
}

long greater(float *a, long l, long r)
{
    compare_count++;
    return *(a+l) > *(a+r) ? 1 : 0;
}

void swap(float *a, long l, long r)
{
    swap_count++;

    float temp;

    temp = *(a+l);
    *(a+l) = *(a+r);
    *(a+r) = temp;
}

float tg(float x)
{
    return tan(x);
}

float ctg(float x)
{
    return 1.0/tan(x);
}

void init(float *m,long n)
{
    long i,j;
    for (i = 0; i < n; i++)
    {
        for (j=0; j< n; j++)
        {
            m[i + j*n] = tg(0.2*(i+1)) + ctg(0.3*(j+1));
        }
    }
}

void print(float *m, long n)
{
    long i, j;
    for(i = 0; i < n; i++)
    {
        for(j = 0; j < n; j++)
        {
            printf("  %5.1f", m[i + j*n]);
        }
        printf("\r\n");
    }
    printf("\r\n");
}

void sort(float *a, long n, const sort_function& sort)
{
    long i, sort_compare = 0, sort_swap = 0;

    init(a,n);

    for(i = 0; i < n*n; i+=n)
    {
        if (fmod (i / n, 2) == 0)
        {
            compare_count = 0;

            swap_count = 0;
            sort(a, i, i+n-1, &less);

            if (swap_count == 0)
            {
                compare_count = 0;
                sort(a, i, i+n-1, &greater);
            }

            sort_compare += compare_count;
            sort_swap += swap_count;
        }
    }

    printf("compare=%ld\r\n", sort_compare);
    printf("swap=%ld\r\n", sort_swap);

    printf("\r\n");
}

I'm going to implement a toy tape "mainframe" for a students, showing the quickness of "quicksort" class functions (recursive or not, does not really matter, due to the slow hardware, and well known stack reversal techniques) compared to the "bubblesort" function class. So, while I'm clear about the hardware implementation and controllers, I guessed that quicksort function is much faster than other ones in terms of sequence, order and comparison distance (it is much faster to rewind the tape from the middle than from the very end, because of different speed of rewind).

Unfortunately, this is not true; this simple "bubble" code shows great improvements compared to the "quicksort" functions in terms of comparison distances, direction and number of comparisons and writes.

So I have 3 questions:

  1. Does I have a mistake in my implememtation of quicksort function?
  2. Does I have a mistake in my implememtation of bubblesoft function?
  3. If not, why is the "bubblesort" function so much faster in (comparison and write operations) than "quicksort" function?

I already have a "quicksort" function:

void quicksort(float *a, long l, long r, const compare_function& compare)
{
    long i=l, j=r, temp, m=(l+r)/2;
    if (l == r) return;
    if (l == r-1)
    {
        if (compare(a, l, r))
        {
            swap(a, l, r);
        }
        return;
    }
    if (l < r-1)
    {
        while (1)
        {
            i = l;
            j = r;
            while (i < m && !compare(a, i, m)) i++;
            while (m < j && !compare(a, m, j)) j--;
            if (i >= j)
            {
                break;
            }
            swap(a, i, j);
        }
        if (l < m) quicksort(a, l, m, compare);
        if (m < r) quicksort(a, m, r, compare);
        return;
    }
}

and I have my own implementation of the "bubblesort" function:

void bubblesort(float *a, long l, long r, const compare_function& compare)
{
    long i, j, k;
    if (l == r)
    {
        return;
    }
    if (l == r-1)
    {
        if (compare(a, l, r))
        {
            swap(a, l, r);
        }
        return;
    }
    if (l < r-1)
    {
        while(l < r)
        {
            i = l;
            j = l;
            while (i < r)
            {
                i++;
                if (!compare(a, j, i))
                {
                    continue;
                }
                j = i;
            }
            if (l < j)
            {
                swap(a, l, j);
            }
            l++;
            i = r;
            k = r;
            while(l < i)
            {
                i--;
                if (!compare(a, i, k))
                {
                    continue;
                }
                k = i;
            }
            if (k < r)
            {
                swap(a, k, r);
            }
            r--;
        }
        return;
    }
}

I have used these sort functions in a test sample code, like this:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <conio.h>

long swap_count;
long compare_count;

typedef long (*compare_function)(float *, long, long );
typedef void (*sort_function)(float *, long , long , const compare_function& );

void init(float *, long );
void print(float *, long );

void sort(float *, long, const sort_function& );
void swap(float *a, long l, long r);

long less(float *a, long l, long r);
long greater(float *a, long l, long r);

void bubblesort(float *, long , long , const compare_function& );
void quicksort(float *, long , long , const compare_function& );

void main()
{
    int n;
    printf("n=");

    scanf("%d",&n);
    printf("\r\n");

    long i;
    float *a = (float *)malloc(n*n*sizeof(float));

    sort(a, n, &bubblesort);
    print(a, n);

    sort(a, n, &quicksort);
    print(a, n);

    free(a);
}

long less(float *a, long l, long r)
{
    compare_count++;
    return *(a+l) < *(a+r) ? 1 : 0;
}

long greater(float *a, long l, long r)
{
    compare_count++;
    return *(a+l) > *(a+r) ? 1 : 0;
}

void swap(float *a, long l, long r)
{
    swap_count++;

    float temp;

    temp = *(a+l);
    *(a+l) = *(a+r);
    *(a+r) = temp;
}

float tg(float x)
{
    return tan(x);
}

float ctg(float x)
{
    return 1.0/tan(x);
}

void init(float *m,long n)
{
    long i,j;
    for (i = 0; i < n; i++)
    {
        for (j=0; j< n; j++)
        {
            m[i + j*n] = tg(0.2*(i+1)) + ctg(0.3*(j+1));
        }
    }
}

void print(float *m, long n)
{
    long i, j;
    for(i = 0; i < n; i++)
    {
        for(j = 0; j < n; j++)
        {
            printf("  %5.1f", m[i + j*n]);
        }
        printf("\r\n");
    }
    printf("\r\n");
}

void sort(float *a, long n, const sort_function& sort)
{
    long i, sort_compare = 0, sort_swap = 0;

    init(a,n);

    for(i = 0; i < n*n; i+=n)
    {
        if (fmod (i / n, 2) == 0)
        {
            compare_count = 0;

            swap_count = 0;
            sort(a, i, i+n-1, &less);

            if (swap_count == 0)
            {
                compare_count = 0;
                sort(a, i, i+n-1, &greater);
            }

            sort_compare += compare_count;
            sort_swap += swap_count;
        }
    }

    printf("compare=%ld\r\n", sort_compare);
    printf("swap=%ld\r\n", sort_swap);

    printf("\r\n");
}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

笛声青案梦长安 2024-10-23 07:37:00

我认为问题在于大多数快速排序实现都依赖于分区步骤,该分区步骤在要排序的区域的两端交替进行读取和写入。在随机访问模型中,这完全没问题(所有读取本质上都是 O(1)),但在磁带上,这可能非常昂贵,因为在要排序的范围的不同端之间来回交换可能需要 O( n) 卷带一直向前和向后滚动的时间。这将通常的 O(n) 分区步骤变成了可能的 O(n2),从而控制了函数的运行时间。此外,由于执行磁带查找所需的时间可能比处理器频率慢数千或数百万倍,因此这种 O(n2) 工作具有巨大的常数因子。

另一方面,冒泡排序不存在这个问题,因为它总是比较数组中的相邻单元格。它最多使数组通过 O(n) 次,因此只需将磁带倒带 n 次。冒泡排序的处理逻辑肯定更昂贵 - 比几乎任何其他 O(n2) 排序都要昂贵 - 但这与不来回查找磁带所节省的时间相比根本不算什么。

简而言之,快速排序在磁带上的运行速度可能比冒泡排序慢得多,因为它需要磁带在执行过程中移动更多。由于磁带查找的成本很高,因此快速排序的天然运行时间优势将在这一步中消失,而冒泡排序应该看起来更快。

I think that the problem is that most quicksort implementations rely on a partition step that alternates reads and writes on opposite ends of the region to be sorted. In a random-access model this is perfectly fine (all reads are essentially O(1)), but on a tape this can be extremely expensive, since swapping back and forth between the different ends of the range to be sorted can take O(n) time as the tape reel rolls all the way forwards and backwards. This turns what normally is an O(n) partitioning step into something that's potentially O(n2), dominating the runtime of the function. Moreover, since the time required to do a tape seek is probably thousands or millions of times slower than the processor's frequency, this O(n2) work has a huge constant factor.

Bubble sort, on the other hand, doesn't have this problem because it always compares adjacent cells in the array. It makes at most O(n) passes over the array, and thus requires the tape to be rewound only n times. The processing logic is definitely more expensive in bubble sort - more so than almost any other O(n2) sort - but this is nothing compared to the time saved not seeking the tape back and forth.

In short, quicksort should probably run much slower on a tape than bubble sort simply because it requires the tape to move a lot more during execution. Since tape seeking is expensive, the natural runtime advantage of quicksort will get eaten up in this step, and bubblesort should look much faster.

难理解 2024-10-23 07:37:00

templatetypedef的答案 是正确的。冒泡排序的访问不仅最小化分布,而且还就地运行。我怀疑对于具有单个、任意慢的磁带且只有 O(1) RAM 的机器来说,这实际上是最佳排序算法。 [编辑:事实上鸡尾酒排序(冒泡排序的双向版本)应该更好因为它避免了浪费的倒带 - 感谢 Steve Jessop。]

如果您有 4 个可用的磁带驱动器,则 归并排序占据主导地位。只需 3 个磁带,就可以使用更高级版本的合并排序

templatetypedef's answer is right on the money. Not only are bubblesort's accesses minimally spread out, it operates in-place. I suspect it is actually the best sorting algorithm for a machine having a single, arbitrarily slow tape and only O(1) RAM. [EDIT: In fact cocktail sort (a bidirectional version of bubblesort) should be better as it avoids wasteful rewinds -- thanks Steve Jessop.]

If you have 4 tape drives available then mergesort rules the roost. With only 3 tapes, a fancier version of mergesort can be used.

愿得七秒忆 2024-10-23 07:37:00

快速排序比冒泡排序更快的原因之一是它可以瞬间将元素移动很远的距离。如果快速排序将一个元素向上移动 50 个元素,然后向下移动 20 个、向上 10 个、向上 5 个、向下 2 个,最后到达正确的位置,则该元素最终将比起始位置移动 43 个位置,而只移动了 5 次。冒泡排序会移动元素 43 次。如果移动元素一个位置的成本与移动元素 50 个位置的成本相同,那就是重大胜利。然而,如果移动元素的成本与距离成正比,则快速排序会将元素移动的总距离为 87 个槽,是冒泡排序的两倍。

如果人们在处理磁带驱动器时遇到困难,最佳算法将在很大程度上取决于驱动器的物理工作方式。例如,在某些驱动器上,唯一的操作是倒带并准备写入(在此过程中有效地擦除磁带),倒带并准备读取,以及处理下一个字节(读取或写入,取决于倒带模式)。其他驱动器允许在磁带上的任何位置随机访问和替换单个块。某些驱动器仅限于沿一个方向读取。其他磁带(例如 QIC 磁带)具有一些沿一个方向读取的磁道和一些沿另一个方向读取的磁道。我不知道是否有任何驱动器允许在两个方向上读取或写入同一数据块,但这样的事情至少在理论上是可能的。

One of the reasons QuickSort is faster than bubble sort is that it instantaneously move elements great distances. If QuickSort moves an element up 50 elements, then down 20, up 10, up 5, and down 2 before it ends up in its proper spot, the element will end up 43 slots from where it started, while having only moved 5 times. Bubble sort would have moved the element 43 times. If moving the element one slot costs the same as moving it by 50, that's a major win. If, however, the cost of moving an element is proportional to the distance, QuickSort will have moved the element a total distance of 87 slots--twice as many as bubble sort.

If one is stuck dealing with tape drives, the optimal algorithm will depend a lot upon how the drives physically work. For example, on some drives the only operations are rewind and prepare for writing (effectively erasing the tape in the process), rewind and prepare for reading, and process the next byte (read or write, depending upon the rewind mode). Other drives allow individual blocks to be randomly accessed and replaced anywhere on the tape. Some drives are limited to reading in one direction. Others (e.g. QIC tapes) have some tracks which read in one direction and some which read in the other direction. I don't know if any drives would allow the same block of data to be read or written in both directions, but such a thing would be at least theoretically possible.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文