如何迭代地编写归并排序?

发布于 2024-12-26 21:58:39 字数 666 浏览 1 评论 0原文

我写了一个递归版本的合并排序。它使用了一个单独的合并例程:

def merge(lst1, lst2):
    i = j = 0
    merged = []
    while i < len(lst1) and j < len(lst2):
        if lst1[i] <= lst2[j]:
            merged.append(lst1[i])
            i += 1
        else:
            merged.append(lst2[j])
            j += 1
    merged.extend(lst1[i:])
    merged.extend(lst2[j:])
    return merged

def merge_sort(lst):
    if len(lst) < 2:
        return lst
    else:
        middle = len(lst) / 2
        return merge(merge_sort(lst[:middle]), merge_sort(lst[middle:]))

为了节省堆栈空间(以及为了学习算法的乐趣/纯粹的乐趣),我尝试以迭代方式编写这个函数。然而,我发现这很困难,因为我不确定最终如何组合不同的列表。

谢谢你!

I have written a recursive version of merge sort. It makes use of a separate merge routine:

def merge(lst1, lst2):
    i = j = 0
    merged = []
    while i < len(lst1) and j < len(lst2):
        if lst1[i] <= lst2[j]:
            merged.append(lst1[i])
            i += 1
        else:
            merged.append(lst2[j])
            j += 1
    merged.extend(lst1[i:])
    merged.extend(lst2[j:])
    return merged

def merge_sort(lst):
    if len(lst) < 2:
        return lst
    else:
        middle = len(lst) / 2
        return merge(merge_sort(lst[:middle]), merge_sort(lst[middle:]))

To conserve stack space (and for kicks/the sheer joy of learning algorithms), I am trying to write this function in an iterative manner. However, I find this difficult because I am not sure how to combine disparate lists in the very end.

Thank you!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

意中人 2025-01-02 21:58:39

您将需要一个将被重复调用的 merge 函数(相同或几乎相同的 merge 函数)。因此,您不需要更改 merge 函数。

这是一个多遍解决方案。从块大小 2 开始,并在每次传递中将块大小加倍。

在每次传递中,将列表划分为大小不重叠的块。将每个块分成 2 部分,并对这 2 部分调用 merge

这是一个自下而上的版本。

You will need a merge function (the same or almost same merge function) which will be called repeatedly. So, you don't need to change the merge function.

This is a multiple pass solution. Start with a chunk size of 2 and double the chunk size in every pass.

In every pass, partition the list into non-overlapping chunks of size whatever. Split every chunk into 2 and call merge on the 2 parts.

This is a bottom up version.

暖伴 2025-01-02 21:58:39

我根据Divya的描述进行了扩展(还添加了测试功能以进行验证)。下面的代码可以通过消除子数组(data_1 和 data_2)并就地排序来优化。

def merge_sort_iterative(data):
  """ gets the data using merge sort and returns sorted."""

  for j in range(1, len(data)):
    j *= 2
    for i in range(0,len(data),j):
      data_1 = data[i:i+(j/2)]
      data_2 = data[i+(j/2):j-i]
      l = m = 0
      while l < len(data_1) and m < len(data_2):
        if data_1[l] < data_2[m]:
          m += 1
        elif data_1[l] > data_2[m]:
          data_1[l], data_2[m] = data_2[m], data_1[l]
          l += 1
      data[i:i+(j/2)], data[i+(j/2):j-i] = data_1, data_2

  return data

def test_merge_sort():
  """test function for verifying algorithm correctness"""

  import random
  import time

  sample_size = 5000
  sample_data = random.sample(range(sample_size*5), sample_size)
  print 'Sample size: ', sample_size

  begin = time.time()
  sample_data = [5,4,3,2,1]
  result = merge_sort_iterative(sample_data)
  end = time.time()
  expected = sorted(sample_data)
  print 'Sorting time: %f \'secs'%(end-begin)

  assert result == expected, 'Algorithm failed'
  print 'Algorithm correct'

if __name__ == '__main__':
  test_merge_sort()

I expanded based on Divya's description (added a test function for verification as well). The below code may be optimized by eliminating sub-arrays(data_1 and data_2) and sorting in place.

def merge_sort_iterative(data):
  """ gets the data using merge sort and returns sorted."""

  for j in range(1, len(data)):
    j *= 2
    for i in range(0,len(data),j):
      data_1 = data[i:i+(j/2)]
      data_2 = data[i+(j/2):j-i]
      l = m = 0
      while l < len(data_1) and m < len(data_2):
        if data_1[l] < data_2[m]:
          m += 1
        elif data_1[l] > data_2[m]:
          data_1[l], data_2[m] = data_2[m], data_1[l]
          l += 1
      data[i:i+(j/2)], data[i+(j/2):j-i] = data_1, data_2

  return data

def test_merge_sort():
  """test function for verifying algorithm correctness"""

  import random
  import time

  sample_size = 5000
  sample_data = random.sample(range(sample_size*5), sample_size)
  print 'Sample size: ', sample_size

  begin = time.time()
  sample_data = [5,4,3,2,1]
  result = merge_sort_iterative(sample_data)
  end = time.time()
  expected = sorted(sample_data)
  print 'Sorting time: %f \'secs'%(end-begin)

  assert result == expected, 'Algorithm failed'
  print 'Algorithm correct'

if __name__ == '__main__':
  test_merge_sort()
柠檬色的秋千 2025-01-02 21:58:39

这是Java实现

public static <T extends Comparable<? super T>> void iterativeMergeSort(T[] seed) {

    for (int i = 1; i <seed.length; i=i+i)
    {
        for (int j = 0; j < seed.length - i; j = j + i+i)
        {
            inPlaceMerge(seed, j, j + i-1, Math.min(j+i + i -1, seed.length -1));
        }
    }       
}

public static <T extends Comparable<? super T>>  void inPlaceMerge(T[] collection, int low, int mid, int high) {
    int left = low;
    int right = mid + 1;

    if(collection[mid].equals(collection[right])) {
        return ;//Skip the merge if required
    }
    while (left <= mid && right <= high) {          
        // Select from left:  no change, just advance left
        if (collection[left].compareTo(collection[right]) <= 0) {
            left ++;
        } else { // Select from right:  rotate [left..right] and correct
            T tmp = collection[right]; // Will move to [left]
            rotateRight(collection, left, right - left);
            collection[left] = tmp;
            // EVERYTHING has moved up by one
            left ++; right ++; mid ++;
        }
    }       
}

这是单元测试

private Integer[] seed;

@Before
public void doBeforeEachTestCase() {
    this.seed = new Integer[]{4,2,3,1,5,8,7,6};
}
@Test
public void iterativeMergeSortFirstTest() {
    ArrayUtils.<Integer>iterativeMergeSort(seed);
    Integer[] result = new Integer[]{1,2,3,4,5,6,7,8};
    assertThat(seed, equalTo(result));  
}

Here is Java Implementation

public static <T extends Comparable<? super T>> void iterativeMergeSort(T[] seed) {

    for (int i = 1; i <seed.length; i=i+i)
    {
        for (int j = 0; j < seed.length - i; j = j + i+i)
        {
            inPlaceMerge(seed, j, j + i-1, Math.min(j+i + i -1, seed.length -1));
        }
    }       
}

public static <T extends Comparable<? super T>>  void inPlaceMerge(T[] collection, int low, int mid, int high) {
    int left = low;
    int right = mid + 1;

    if(collection[mid].equals(collection[right])) {
        return ;//Skip the merge if required
    }
    while (left <= mid && right <= high) {          
        // Select from left:  no change, just advance left
        if (collection[left].compareTo(collection[right]) <= 0) {
            left ++;
        } else { // Select from right:  rotate [left..right] and correct
            T tmp = collection[right]; // Will move to [left]
            rotateRight(collection, left, right - left);
            collection[left] = tmp;
            // EVERYTHING has moved up by one
            left ++; right ++; mid ++;
        }
    }       
}

Here is the unit test

private Integer[] seed;

@Before
public void doBeforeEachTestCase() {
    this.seed = new Integer[]{4,2,3,1,5,8,7,6};
}
@Test
public void iterativeMergeSortFirstTest() {
    ArrayUtils.<Integer>iterativeMergeSort(seed);
    Integer[] result = new Integer[]{1,2,3,4,5,6,7,8};
    assertThat(seed, equalTo(result));  
}
三月梨花 2025-01-02 21:58:39

递归更直观,因此我更喜欢相同的方法,除非在某些情况下我想避免显着的堆栈深度(例如,在消耗某些协同例程实现时)。然而,在合并排序的情况下,迭代版本实际上更容易遵循(至少是伪代码)。

所需要的只是一个嵌套循环,其中内部循环对 2^k 个元素对执行合并,外部循环负责递增 k。

所需的附加步骤是将任何未配对的组与先前合并的组合并。如果元素的数量不是 2 的幂,则会遇到不成对的组。不成对的组将始终位于迭代的末尾。

例如
[5,7,3,4,1,9]→ [5, 7] [3, 4] [1, 9] → [3,4,5,7][1,9]→ [1, 3, 4, 5, 7, 9]

在上面的示例中,[1, 9] 是一个最初没有要合并的其他组的组。因此它与前一组合并(已经合并并排序了)

这是一个 python 实现:

from MergeSort import merge

def sort(arr):
    n = len(arr) - 1
    c = 1
    start = 0
    mid = 0
    end = 0
    while c <= n:
        while end < n:
            mid = start + c//2
            end = start + c
            if (start < n) and (end <= n):
                merge(arr, start, mid, end)
                start = end + 1
            else:
                merge(arr, start - c - 1, start-1, n)
        c = 2*c + 1
        start = 0
        mid = 0
        end = 0

我使用了常规(递归)版本中的合并函数。虽然上面的代码不是最优雅的,但它可以工作并且具有与递归版本相同的复杂性。 (我没有彻底检查过,但乍一看似乎是这样)

这是一个单元测试:

def test_merge_sort_iterative(self):
    for i in range(1, 100):
        length = randint(10, 5000)
        data = [randint(1, 10000) for x in range(1, length)]
        IterativeMergeSort.sort(data)
        issorted = True
        i = 0
        while (i < len(data) - 1) & issorted:
            if data[i] > data[i + 1]:
                issorted = False
            i += 1
    self.assertTrue(issorted, data)
    return

Recursion is more intuitive hence I prefer the same except in some situations when I want to avoid a significant stack depth (e.g while consuming certain co-routine implementations). In case of Merge sort however the iterative version is actually easier to follow (at least the pseudo code).

All that's needed is a nested loop with the inner loop performing merges on pairs of 2^k elements with the outer loop responsible for incrementing k.

An additional step that is required is to merge any unpaired group with the previous merged group. An unpaired group will be encountered if the number of elements is not a power of 2. An unpaired group will always be at the end of the iteration.

e.g.
[5, 7, 3, 4, 1, 9] -> [5, 7] [3, 4] [1, 9] -> [3, 4, 5, 7] [1, 9] -> [1, 3, 4, 5, 7, 9]

In the above example [1, 9] is a group that did not have another group to be merged with initially. Thus it was merged with the previous group (which had been merged and sorted already)

Here is a python implementation:

from MergeSort import merge

def sort(arr):
    n = len(arr) - 1
    c = 1
    start = 0
    mid = 0
    end = 0
    while c <= n:
        while end < n:
            mid = start + c//2
            end = start + c
            if (start < n) and (end <= n):
                merge(arr, start, mid, end)
                start = end + 1
            else:
                merge(arr, start - c - 1, start-1, n)
        c = 2*c + 1
        start = 0
        mid = 0
        end = 0

I used the merge function from the regular (recursive) version. While the above code is not the most elegant, but it works and has the same complexity as the recursive version. (I have not checked thoroughly but it seems so to me from a quick glance)

Here is a unit test:

def test_merge_sort_iterative(self):
    for i in range(1, 100):
        length = randint(10, 5000)
        data = [randint(1, 10000) for x in range(1, length)]
        IterativeMergeSort.sort(data)
        issorted = True
        i = 0
        while (i < len(data) - 1) & issorted:
            if data[i] > data[i + 1]:
                issorted = False
            i += 1
    self.assertTrue(issorted, data)
    return
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文