Why is an even-odd split "faster" for MergeSort?
MergeSort is a divide-and-conquer algorithm that divides the input into several parts and solves the parts recursively.
...There are several approaches for the split function. One way is to split down the middle. That approach has some nice properties, however, we'll focus on a method that's a little bit faster: even-odd split. The idea is to put every even-position element in one list, and every odd-position in another.
This is straight from my lecture notes. Why exactly is it the case that the even-odd split is faster than down the middle of the array?
I'm speculating it has something to do with the list being passed into MergeSort already being sorted, but I'm not entirely sure.
Could anyone shed some light on this?
Edit: I tried running the following in Python...

from timeit import Timer

K = []
for i in range(1, 100000):
    K.append(i)

def testMergeSort():
    """
    testMergeSort shows the proper functionality for the
    Merge Sort Algorithm implemented above.
    """
    # note: [K] wraps K in a one-element list, so each call sorts a list of length 1
    t = Timer("mergeSort([K])", "from __main__ import *")
    print(t.timeit(1000000))
    p = Timer("mergeSort2([K])", "from __main__ import *")
    print(p.timeit(1000000))

testMergeSort()
(mergeSort is the even-odd merge sort; mergeSort2 splits down the middle.)
And the result was:
0.771506746608
0.843161219237
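The question's mergeSort and mergeSort2 aren't shown anywhere in the thread; a minimal sketch of what the two variants presumably look like (names match the timing code above, but the implementation details here are my own assumption, using a standard two-way merge):

```python
def merge(left, right):
    """Standard merge of two already-sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])   # at most one of these
    out.extend(right[j:])  # extends is non-empty
    return out

def mergeSort(xs):
    """Merge sort using the even-odd split (positions 0,2,4,... vs 1,3,5,...)."""
    if len(xs) <= 1:
        return xs
    return merge(mergeSort(xs[0::2]), mergeSort(xs[1::2]))

def mergeSort2(xs):
    """Merge sort splitting down the middle."""
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(mergeSort2(xs[:mid]), mergeSort2(xs[mid:]))
```

Both return a new sorted list; only the split rule differs.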
I can see that it could be better because splitting on alternating elements means you don't have to know how long the input is to start with - you just take elements and put them into alternating lists until you run out.
Also, you could potentially start splitting the resulting lists before you have finished iterating through the first list which, if you are careful, allows for better parallel processing.
I should add that I'm no expert on these matters, they are just things that came to mind...
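The "don't need to know the length" point can be sketched like this (the name alternating_split is my own, and the input is assumed to be any iterable, including a one-shot generator whose length is unknown up front):

```python
def alternating_split(iterable):
    """Split a stream into two lists by alternating elements,
    without ever asking for the stream's length."""
    lists = ([], [])
    for k, x in enumerate(iterable):
        lists[k % 2].append(x)  # even positions -> lists[0], odd -> lists[1]
    return lists

# Works on a generator, whose length is not known in advance:
evens, odds = alternating_split(x * x for x in range(7))
```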
The closer the input list is to already being sorted, the lower the runtime (this is because the merge procedure doesn't have to move any of the values if everything is already in the correct order; it just performs O(n) comparisons). Since MergeSort recursively calls itself on each half of the split, one wants to choose a split function that increases the likelihood that the resulting halves of the list are in sorted order. If the list is mostly sorted, an even-odd split will do a better job of this than splitting down the middle. For example, an interleaved list such as [1, 5, 2, 6, 3, 7, 4, 8] would result in [1, 5, 2, 6] and [3, 7, 4, 8] if we split down the middle (note that both sub-lists have sorting errors), whereas if we did an even-odd split we would get [1, 2, 3, 4] and [5, 6, 7, 8], which results in two already-sorted lists (and best-case performance for the subsequent calls to MergeSort). Without knowing anything about the input lists, though, the choice of splitting function shouldn't affect runtime asymptotically.
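The sorted-halves argument can be checked mechanically; here is a small sketch (the interleaved example list and the helper is_sorted are mine, chosen purely for illustration):

```python
def is_sorted(xs):
    """True if xs is in non-decreasing order."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

data = [1, 5, 2, 6, 3, 7, 4, 8]  # illustrative interleaved input

mid = len(data) // 2
middle_halves = (data[:mid], data[mid:])    # ([1, 5, 2, 6], [3, 7, 4, 8])
even_odd_halves = (data[0::2], data[1::2])  # ([1, 2, 3, 4], [5, 6, 7, 8])

print([is_sorted(h) for h in middle_halves])    # neither half is sorted
print([is_sorted(h) for h in even_odd_halves])  # both halves are sorted
```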
I suspect there is noise in your experiment. :) Some of it may come from compare-and-swap not actually moving any elements in the list which avoids cache invalidation, etc.
Regardless, there is a chat about this here: https://cstheory.stackexchange.com/questions/6732/why-is-an-even-odd-split-faster-for-mergesort/6764#6764 (and yes, I did post a similar answer there (full disclosure))
The related Wikipedia articles point out that mergesort is O(n log(n)) while Odd-Even Merge Sort is O(n log(n)^2). Odd-Even is certainly "slower", but the sorting network is static, so you always know what operations you are going to perform, and (looking at the graphic in the Wikipedia entry) notice how the algorithm stays parallel until the end.
Whereas merge sort's final step merges 2 lists together in one sequential pass, the last comparisons of the 8-element sorting network for Odd-Even merge sort are still independent of one another.
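For context, here is a compact sketch of Batcher's odd-even mergesort expressed as a static network of compare-exchange pairs (this follows the textbook construction for power-of-two input sizes; it is not code from this thread). Because the pair sequence is fixed in advance, pairs at the same depth touch disjoint indices and could run in parallel:

```python
def oddeven_merge(lo, hi, r):
    """Yield compare-exchange pairs merging two odd-even-sorted halves
    of the index range lo..hi (inclusive), at stride r."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)      # merge even subsequence
        yield from oddeven_merge(lo + r, hi, step)  # merge odd subsequence
        for i in range(lo + r, hi - r, step):
            yield (i, i + r)                        # final clean-up comparators
    else:
        yield (lo, lo + r)

def oddeven_merge_sort(lo, hi):
    """Yield the full sorting network for indices lo..hi (inclusive)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)
        yield from oddeven_merge_sort(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)

def apply_network(a):
    """Run the static network over list a (length must be a power of two)."""
    for i, j in oddeven_merge_sort(0, len(a) - 1):
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    return a
```

For 8 inputs this generates the familiar 19-comparator network; the comparator list is the same for every input, which is exactly why it maps well onto parallel hardware.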