STL __merge_without_buffer 算法?

发布于 2024-07-11 21:04:23 字数 214 浏览 4 评论 0原文

在哪里可以获得 C++ STL 中 __merge_without_buffer() 中使用的算法的详细高级描述? 我正在尝试用 D 编程语言重新实现此代码,并进行一些增强。 仅通过阅读 STL 源代码,我似乎无法理解它在算法层面上所做的事情,因为有太多的低级细节掩盖了它。 另外,我不想盲目地翻译代码,因为那样的话,如果它不起作用,我将不知道为什么,而且我将无法添加我的增强功能。

Where can I get a decent high-level description of the algorithm used in __merge_without_buffer() in the C++ STL? I'm trying to reimplement this code in the D programming language, with some enhancements. I can't seem to grok what it's doing at the algorithmic level from just reading the STL source code because there are too many low-level details obscuring it. Also, I don't want to just blindly translate the code because then, if it didn't work I wouldn't know why, and I wouldn't be able to add my enhancements.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

梦巷 2024-07-18 21:04:23

__merge_without_buffer() 正在执行就地合并,作为就地合并排序的合并步骤。 它采用两个范围的数据 [first, middle)[middle, last) 作为输入,假定这些数据已经排序。 len1len2 参数等于两个输入范围的长度,即 (middle - first) 和 (last -中)分别。

首先,它选择一个枢轴元素。 然后,它将数据重新排列为 A1 B1 A2 B2 的顺序,其中 A1[first, middle) 中的元素集合,这些元素是小于主元,A2[first, middle) 中大于或等于主元的元素集合,B1 是集合[middle, last) 中小于主元的元素的数量,B2[middle, last) 中大于或的元素集合等于枢轴。 注意,数据原本的顺序是A1 A2 B1 B2,所以我们需要做的就是将A2 B1变成B1 A2。 这是通过调用 std::rotate() 来实现的。

现在我们已经将小于主元的元素(A1B1)与大于或等于主元的元素(A2< /code> 和 B2),所以现在我们可以递归地合并两个子范围 A1 A2B1 B2

我们如何选择枢轴? 在我正在查看的实现中,它从较大的子范围中选择中值元素(即,如果 [first, middle) 的元素多于 [middle, last),它选择[first, middle)的中位数; 否则,它选择[middle, last))的中位数。 由于子范围已经排序,因此选择中位数很简单。 这种主元选择确保了当递归合并两个子范围时,每个子问题不超过当前问题大小的 3/4,因为在最坏的情况下,至少有 1/4 的元素大于或小于主元。

这个的运行时间是多少? std::rotate() 调用需要 O(N) 时间,并且我们对自己进行了两次递归调用。 这相当于 O(N log N) 的运行时间。 但是,请注意,这只是合并排序中的一个步骤:请记住,在合并排序中,您首先对两半进行递归排序,然后合并。 因此,归并排序运行时间的递推关系现在为:

T(N) = 2T(N/2) + O(N log N)

将其插入 主定理,您会发现合并排序现在运行时间为 O(N log2 N) !

作为一个有趣的最后一点,请考虑基于比较的排序算法的以下三个品质:

  1. 在 O(N log N) 时间内就地
  2. 稳定
  3. 运行

您通常一次只能获得其中的 2 个 - 快速排序可以让您获得 (1) 和(3),归并排序得到 (2) 和 (3),而就地归并排序得到 (1) 和 (2)。 非基于比较的排序(例如计数排序)可以实现所有 3 个排序,但这些排序只能对某些数据类型进行排序。 可能存在一种基于比较的排序可以实现所有这 3 个目的,但如果存在,我不知道它的存在,而且几乎可以肯定它要复杂得多。

__merge_without_buffer() is performing an in-place merge, as the merge step of an in-place merge sort. It takes as input two ranges of data [first, middle) and [middle, last) which are assumed to already be sorted. The len1 and len2 parameters are equal to the lengths of the two input ranges, namely (middle - first) and (last - middle) respectively.

First, it picks a pivot element. Then, it rearranges the data into the order A1 B1 A2 B2, where A1 is the set of elements in [first, middle) that are less than the pivot, A2 is the set of elements in [first, middle) greater than or equal to the pivot, B1 is the set of elements in [middle, last) less than the pivot, and B2 is the set of elements in [middle, last) greater than or equal to the pivot. Note that the data is originally in the order A1 A2 B1 B2, so all we need to do is to turn A2 B1 into B1 A2. This is with a call to std::rotate(), which does just that.

Now we've separated out the elements which are less than the pivot (A1 and B1) from those that are greater than or equal to the pivot (A2 and B2), so now we can recursively merge the two subranges A1 A2 and B1 B2.

How do we choose a pivot? In the implementation I'm looking at, it chooses the median element from the larger subrange (i.e. if [first, middle) has more elements than [middle, last), it chooses the median of [first, middle); otherwise, it chooses the median of [middle, last)). Since the subranges are already sorted, choosing the median is trivial. This pivot choice ensures that when recursively merging the two subranges, each subproblem is no more than 3/4 the size of the current problem, because in the worst case, at least 1/4 of the elements are larger than or smaller than the pivot.

What is the running time of this? The std::rotate() call takes O(N) time, and we make two recursive calls to ourselves. This equates to a running time of O(N log N). However, note that this is only one step in mergesort: remember that in mergesort you first recursively sort both halves and then merge. Thus, the recurrence relation for the running time of mergesort is now:

T(N) = 2T(N/2) + O(N log N)

Plug this into the Master theorem, and you get that mergesort now runs in O(N log2 N) time!

As an interesting final point, consider the following three qualities of a comparison-based sorting algorithm:

  1. In-place
  2. Stable
  3. Runs in O(N log N) time

You can usually only get 2 of these at once - quicksort gets you (1) and (3), mergesort gets you (2) and (3), and an in-place mergesort gets you (1) and (2). Non-comparison-based sorts such as count sort can achieve all 3, but those sorts can only sort certain data types. It's possible there exists a comparison-based sort which achieves all 3, but if there is, I'm not aware of its existence, and it's almost certainly much more complicated.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文