为什么就地合并排序不稳定？

发布于 2024-08-15 08:47:30 字数 1649 浏览 17 评论 0原文

下面的实现是稳定的，因为它在标记 XXX 的行使用 <= 而不是 <。这也使其更加高效。是否有任何理由在这一行使用 < 而不是 <= ？

/**
class for In place MergeSort
**/
class MergeSortAlgorithm extends SortAlgorithm {
    void sort(int a[], int lo0, int hi0) throws Exception {
    int lo = lo0;
    int hi = hi0;
    pause(lo, hi);
    if (lo >= hi) {
        return;
    }
    int mid = (lo + hi) / 2;

        /*
         *  Partition the list into two lists and sort them recursively
         */
        sort(a, lo, mid);
        sort(a, mid + 1, hi);

        /*
         *  Merge the two sorted lists
         */
    int end_lo = mid;
        int start_hi = mid + 1;
    while ((lo <= end_lo) && (start_hi <= hi)) {
            pause(lo);
        if (stopRequested) {
                return;
            }
            if (a[lo] <= a[start_hi]) {                   // LINE XXX
                lo++;
            } else {
                /*  
                 *  a[lo] >= a[start_hi]
                 *  The next element comes from the second list, 
                 *  move the a[start_hi] element into the next 
                 *  position and shuffle all the other elements up.
                 */
        int T = a[start_hi];
                for (int k = start_hi - 1; k >= lo; k--) {
                    a[k+1] = a[k];
                    pause(lo);
                }
                a[lo] = T;
                lo++;
                end_lo++;
                start_hi++;
            }
        }
    }

    void sort(int a[])  throws Exception {
    sort(a, 0, a.length-1);
    }
}

原文

The implementation below is stable as it used <= instead of < at line marked XXX. This also makes it more efficient. Is there any reason to use < and not <= at this line?

/**
class for In place MergeSort
**/
class MergeSortAlgorithm extends SortAlgorithm {
    void sort(int a[], int lo0, int hi0) throws Exception {
    int lo = lo0;
    int hi = hi0;
    pause(lo, hi);
    if (lo >= hi) {
        return;
    }
    int mid = (lo + hi) / 2;

        /*
         *  Partition the list into two lists and sort them recursively
         */
        sort(a, lo, mid);
        sort(a, mid + 1, hi);

        /*
         *  Merge the two sorted lists
         */
    int end_lo = mid;
        int start_hi = mid + 1;
    while ((lo <= end_lo) && (start_hi <= hi)) {
            pause(lo);
        if (stopRequested) {
                return;
            }
            if (a[lo] <= a[start_hi]) {                   // LINE XXX
                lo++;
            } else {
                /*  
                 *  a[lo] >= a[start_hi]
                 *  The next element comes from the second list, 
                 *  move the a[start_hi] element into the next 
                 *  position and shuffle all the other elements up.
                 */
        int T = a[start_hi];
                for (int k = start_hi - 1; k >= lo; k--) {
                    a[k+1] = a[k];
                    pause(lo);
                }
                a[lo] = T;
                lo++;
                end_lo++;
                start_hi++;
            }
        }
    }

    void sort(int a[])  throws Exception {
    sort(a, 0, a.length-1);
    }
}

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

月下伊人醉 2024-08-22 08:47:30

因为代码中的 <= 确保相同值的元素（在排序数组的左半部分和右半部分中）不会被交换。
而且，它还避免了无用的交换。

if (a[lo] <= a[start_hi]) {
 /* The left value is smaller than or equal to the right one, leave them as is. */
 /* Especially, if the values are same, they won't be exchanged. */
 lo++;
} else {
 /*
  * If the value in right-half is greater than that in left-half,
  * insert the right one into just before the left one, i.e., they're exchanged.
  */
 ...
}

假设两半中都有相同值的元素（例如“5”）并且上面的运算符是<。
正如上面的注释所示，右侧的“5”将被插入到左侧的“5”之前，换句话说，将交换相同值的元素。
这意味着排序不稳定。
而且，交换相同值的元素效率很低。

我猜效率低下的原因来自于算法本身。
您的合并阶段是使用插入排序实现的（如您所知，它是 O(n^2)）。

当你对巨大的数组进行排序时，你可能必须重新实现。

Because the <= in your code assures that same-valued elements (in left- and right-half of sorting array) won't be exchanged.
And also, it avoids useless exchanges.

if (a[lo] <= a[start_hi]) {
 /* The left value is smaller than or equal to the right one, leave them as is. */
 /* Especially, if the values are same, they won't be exchanged. */
 lo++;
} else {
 /*
  * If the value in right-half is greater than that in left-half,
  * insert the right one into just before the left one, i.e., they're exchanged.
  */
 ...
}

Assume that same-valued element (e.g., ‘5’) in both-halves and the operator above is <.
As comments above shows, the right ‘5’ will be inserted before the left ‘5’, in other words, same-valued elements will be exchanged.
This means the sort is not stable.
And also, it's inefficient to exchange same-valued elements.

I guess the cause of inefficiency comes from the algorithm itself.
Your merging stage is implemented using insertion sort (as you know, it's O(n^2)).

You may have to re-implement when you sort huge arrays.

回复收藏 0 原文