Inserting items into a list that is frequently insertion sorted
I have a list that is frequently insertion sorted. Is there a good position (other than the end) for adding to this list to minimize the work that the insertion sort has to do?
Comments (2)
The best place to insert is where the element belongs in the sorted list. This amounts to doing the insertion sort's work up front, at the moment you add the element.
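As a minimal sketch of inserting at the position where the element belongs, assuming the list is already sorted and is a `java.util.List`: the class name `SortedInsert` and the method `insertSorted` are hypothetical names chosen for illustration, while `Collections.binarySearch` is the standard library call used to find the position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original answer.
final class SortedInsert {
    // Inserts value into an already sorted list at the index where it belongs
    // and returns that index. The list must be sorted before the call.
    static <T extends Comparable<? super T>> int insertSorted(List<T> sorted, T value) {
        int pos = Collections.binarySearch(sorted, value);
        if (pos < 0) {
            // When the key is absent, binarySearch returns (-(insertion point) - 1).
            pos = -(pos + 1);
        }
        sorted.add(pos, value);
        return pos;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 3, 5, 7));
        insertSorted(list, 4);
        System.out.println(list); // prints [1, 3, 4, 5, 7]
    }
}
```

Finding the position takes a logarithmic number of comparisons, but `add(pos, value)` on an `ArrayList` still shifts the elements after `pos`, so each insert remains linear in the worst case.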
Your question doesn't make sense as stated. If the list is kept sorted by insertion, then by definition you can't append to the end: each element ends up in the place where it belongs, otherwise the list wouldn't be sorted.
If you have to add lots of elements, then the best solution is to clone the list, add all elements, sort the new list once and then replace the first list with the clone.
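The clone, add, sort once, replace approach could look roughly like this. It is only a sketch: `BatchInsert` and `withAdded` are hypothetical names, and the caller is assumed to swap its reference to the returned list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original answer.
final class BatchInsert {
    // Returns a new sorted list containing the old elements plus all new ones.
    // The original list is left untouched, so readers of it are not disturbed.
    static <T extends Comparable<? super T>> List<T> withAdded(List<T> sorted, List<T> newItems) {
        List<T> copy = new ArrayList<>(sorted.size() + newItems.size());
        copy.addAll(sorted);
        copy.addAll(newItems);
        Collections.sort(copy); // one sort for the whole batch
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> current = new ArrayList<>(Arrays.asList(2, 4, 6));
        // Replace the old list with the clone once the batch has been sorted.
        current = withAdded(current, Arrays.asList(5, 1, 3));
        System.out.println(current); // prints [1, 2, 3, 4, 5, 6]
    }
}
```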
[EDIT] In reply to your comments: After doing a couple of appends, you must sort the list before you can do the next sorted insertion. So the question isn't how you can make the sorted insertion cheaper but the sort between appends and sorted insertions.
The answer is that most sorting algorithms do pretty well with partially sorted lists. The questions you need to ask are: What sorting algorithm is used, what properties does it have and, most importantly, why should you care?
The last question means that you should measure performance before you do any kind of optimization because you have a 90% chance that it will hurt more than it helps unless it's based on actual numbers.
Back to the sorting. Java uses a version of quicksort to sort arrays of primitives (Arrays.sort()); Collections.sort() itself uses a merge sort (TimSort since Java 7), which already handles partially sorted input well. The reasoning below applies to quicksort. Quicksort selects a pivot element to partition the collection, and this selection is crucial for the performance of the algorithm. For best performance, the pivot element should be as close as possible to the median of the partition. Simple implementations often use an element from the middle of the current partition as the pivot element. Also, quicksort starts processing the list at the small indexes.
So adding the new elements at the end might not give you good performance. It won't affect the pivot element selection but quicksort will look at the new elements after it has checked all the sorted elements already. Adding the new elements in the middle will affect the pivot selection and we can't really tell whether that will have an influence on the performance or not. My instinctive guess is that the pivot element will be better if quicksort finds sorted elements in the middle of the partitions.
That leaves adding new elements at the beginning. This way, quicksort will usually find a perfect pivot element (since the middle of the list will be sorted) and it will pick up the new elements first. The drawback is that inserting at the front of an ArrayList shifts every existing element on each insert. There are two ways to deal with that: a) As I said elsewhere, today's PCs copy huge amounts of RAM in almost no time at all, so you can just ignore this small performance hit. b) You can use a second ArrayList, put all the new elements in it and then use addAll() to insert them at index 0 in one call. Java will do some optimizations internally for this case and just move the existing elements once.
[EDIT2] I completely misunderstood your question. For the insertion sort algorithm, the best place is probably somewhere in the middle. This should halve the chances that you have to move an element through the whole list. But since I'm not 100% sure, I suggest creating a couple of small tests to verify this.
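One way to run the kind of small test suggested here is to count how many element moves a plain insertion sort performs when a single new value is dropped at the front, middle, or end of an otherwise sorted array. The sketch below assumes an `int[]` and a textbook insertion sort; `InsertionSortPlacement`, `insertionSortMoves`, and `withNewElementAt` are hypothetical names made up for this illustration.

```java
// Hypothetical test harness, not part of the original answer.
final class InsertionSortPlacement {
    // Sorts the array in place with a textbook insertion sort and
    // returns how many times an element was shifted one slot to the right.
    static long insertionSortMoves(int[] a) {
        long moves = 0;
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
                moves++;
            }
            a[j + 1] = key;
        }
        return moves;
    }

    // Builds the sorted values 0, 10, 20, ... (n of them) with one extra
    // value placed at the given index (0 = front, n = end).
    static int[] withNewElementAt(int n, int value, int position) {
        int[] a = new int[n + 1];
        for (int i = 0, k = 0; i <= n; i++) {
            a[i] = (i == position) ? value : 10 * k++;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] newValues = {-5, 10 * (n / 2) - 5, 10 * n}; // smallest, median, largest
        int[] positions = {0, n / 2, n};                  // front, middle, end
        for (int value : newValues) {
            for (int position : positions) {
                long moves = insertionSortMoves(withNewElementAt(n, value, position));
                System.out.println("value=" + value + " position=" + position + " moves=" + moves);
            }
        }
    }
}
```

With exactly one out-of-place element, the number of moves equals the distance between where the element was put and where it belongs. Placing it in the middle therefore caps the worst case at about half the list, whereas placing it at either end can cost a full pass, which is consistent with the guess above. Real workloads with several new elements or skewed values may behave differently, so measuring on actual data is still worthwhile.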