Java list sorting: is there a way to keep a list permanently sorted automatically, like TreeMap?
In Java you can build up an ArrayList with items and then call:
Collections.sort(list, comparator);
Is there any way to pass in the Comparator at the time of list creation, as you can with TreeMap?
The goal is to be able to add an element to the list and, instead of having it automatically appended to the end, have the list keep itself sorted based on the Comparator and insert the new element at the index the Comparator determines. So basically the list might have to re-sort upon every new element added.
Is there any way to achieve this with the Comparator, or by some other similar means?
You can change the behaviour of ArrayList, for example by overriding add() so that the list is re-sorted after every insertion.
Note: a PriorityQueue is NOT a List. If you don't care what type of collection it is, the simplest option is a TreeSet, which is just like a TreeMap but is a collection. The only advantage PriorityQueue has is that it allows duplicates.
Note: re-sorting is not very efficient for large collections. Using a binary search to find the insertion point and inserting the entry there would be faster (but more complicated).
EDIT: A lot depends on what you need the "list" to do. I suggest you write a List wrapper for an ArrayList, LinkedList, PriorityQueue, TreeSet, or one of the other sorted collections, and implement only the methods that will actually be used. That way you gain a good understanding of the requirements for the collection and can make sure it works correctly for you.
EDIT(2): There was a lot of interest in using binarySearch instead. ;)
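The code originally posted with this answer is not preserved here, so the following is only a minimal sketch of the binary-search idea, not the author's implementation. The class name BinaryInsertList is hypothetical. It assumes a non-null Comparator and uses Collections.binarySearch, which returns -(insertion point) - 1 when no equal element is found.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;

// Hypothetical class name; nothing like this ships with the JDK.
class BinaryInsertList<E> extends ArrayList<E> {
    private final Comparator<? super E> comparator;

    BinaryInsertList(Comparator<? super E> comparator) {
        this.comparator = comparator;
    }

    @Override
    public boolean add(E element) {
        // A negative result encodes the insertion point as -(point) - 1.
        int index = Collections.binarySearch(this, element, comparator);
        if (index < 0) {
            index = -index - 1;
        }
        // Call ArrayList's positional add directly, bypassing the override below.
        super.add(index, element);
        return true;
    }

    @Override
    public void add(int index, E element) {
        // Callers may not choose a position; the comparator decides it.
        throw new UnsupportedOperationException("position is determined by the comparator");
    }
}
```

The search needs O(log n) comparisons, but ArrayList still shifts the elements after the insertion point. In this sketch addAll, set and ListIterator.add are not overridden and would bypass the ordering, and the relative order of elements that compare as equal is not defined.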
Everyone is suggesting PriorityQueue. However, it is important to realize that if you iterate over the contents of a PriorityQueue, the elements are not guaranteed to be in sorted order. You are only guaranteed to get the "minimum" element from the methods peek(), poll(), etc.
A TreeSet seems to be a better fit. The caveats are that, as a Set, it can't contain duplicate elements, and it doesn't support random access by index.
Commentary
There is probably a good reason that there is no SortedList implementation in the JDK. I personally can't think of a reason to have an auto-sorting list in the JDK. It reeks of premature optimization gone wrong. If the list is not read from as often as it is inserted into, then you are wasting cycles sorting repeatedly for no reason. Sorting right before a read would be much more reactive, and keeping a boolean somewhere that indicates whether the list needs to be sorted before it is read would be even better.
The thing is, you only really care about order when traversing the list with an Iterator or a for-each loop, so calling Collections.sort() before any code that iterates would probably be more performant than trying to keep the list sorted on every insertion.
There are ambiguities with List because of duplicates: how do you order duplicates deterministically? There is SortedSet, and that makes sense because of the uniqueness. But sorting a List can bring more complications from the side effects of duplicates and from other constraints, such as making every object Comparable or, as I show in my code, having to supply a Comparator that can do the work instead.
Sorting on .add()
If you have some very special situation where an auto-sorting List would be useful, then one thing you might do is subclass a List implementation and override .add() to call Collections.sort(this, comparator) with a Comparator that you pass into a custom constructor. I used LinkedList instead of ArrayList for a reason: ArrayList is a natural insertion-ordered List to begin with. It also has the ability to .add() at an index, which is pretty useless if you want a constantly sorted List, so that would have to be handled in some way that would probably be less than ideal. According to the Javadoc, add(int index, E element) is an optional operation, so just throwing UnsupportedOperationException would be acceptable, or you could ignore the index and delegate to .add(Object element) if you document it in a Javadoc comment on the method.
Usually, when you want to do lots of inserts/removals and sorting, you would use a LinkedList because of its better performance characteristics for that usage of the List.
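The original example was lost from this page. What follows is a minimal sketch of the sort-on-add idea described above, not the author's code. The class name SortOnAddList is hypothetical, and the choice to throw from add(int, E) rather than ignore the index is an assumption.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.LinkedList;

// Hypothetical class name, used here for illustration only.
class SortOnAddList<E> extends LinkedList<E> {
    private final Comparator<? super E> comparator;

    SortOnAddList(Comparator<? super E> comparator) {
        this.comparator = comparator;
    }

    @Override
    public boolean add(E element) {
        boolean changed = super.add(element);
        // List.sort (Java 8+) is stable, so elements that compare as equal keep insertion order.
        sort(comparator);
        return changed;
    }

    @Override
    public boolean addAll(Collection<? extends E> elements) {
        boolean changed = super.addAll(elements);
        sort(comparator); // one sort for the whole batch
        return changed;
    }

    @Override
    public void add(int index, E element) {
        // Optional operation: refuse it, because the comparator decides the position.
        throw new UnsupportedOperationException("use add(E); the comparator decides the position");
    }
}
```

Other LinkedList mutators such as addFirst, addLast and push are left untouched in this sketch and would still break the ordering, which illustrates why subclassing a concrete List is less than ideal.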
Most Efficient Solution:
Alternatively, you could sort only when getting the Iterator, which would be more performance-oriented if the sorted order only really matters when iterating over the List. This covers the use case of client code not having to call Collections.sort() before every iteration and encapsulates that behavior in the class.
Of course, there would need to be error checking and handling to see whether the Comparator was null and to decide what to do in that case, but this gives you the idea. You still don't have any deterministic way to deal with duplicates.
Guava Solution:
If you are using Guava, and you should be, you can just use Ordering.immutableSortedCopy() only when you need to iterate, and be done with it.

Something like TreeSet (or TreeMultiset, in case you need duplicates) with more efficient random access is possible, but I doubt it has been implemented in Java. Making each node of the tree remember the size of its left subtree allows accessing an element by index in O(log(size)) time, which is not bad.
To implement it, you'd need to rewrite a good portion of the underlying TreeMap.
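The index lookup can be sketched with a plain binary search tree. This is an illustration under simplifying assumptions, not a replacement for TreeMap: the class name IndexedTree is hypothetical, each node stores the size of its whole subtree (the left-subtree size is read from the left child), and the tree is not balanced, so the O(log(size)) bound only holds while the tree happens to stay shallow. A real implementation would maintain the sizes through red-black rotations.

```java
import java.util.Comparator;

// Hypothetical illustration: an unbalanced binary search tree that tracks subtree sizes.
class IndexedTree<E> {
    private static final class Node<E> {
        final E value;
        Node<E> left, right;
        int size = 1; // number of nodes in the subtree rooted at this node

        Node(E value) { this.value = value; }
    }

    private final Comparator<? super E> comparator;
    private Node<E> root;

    IndexedTree(Comparator<? super E> comparator) {
        this.comparator = comparator;
    }

    int size() { return sizeOf(root); }

    void add(E value) { root = insert(root, value); }

    // Returns the element at the given position in sorted order.
    E get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node<E> node = root;
        while (true) {
            int leftSize = sizeOf(node.left);
            if (index < leftSize) {
                node = node.left;            // target lies in the left subtree
            } else if (index == leftSize) {
                return node.value;           // exactly leftSize elements precede this node
            } else {
                index -= leftSize + 1;       // skip the left subtree and this node
                node = node.right;
            }
        }
    }

    private Node<E> insert(Node<E> node, E value) {
        if (node == null) return new Node<>(value);
        if (comparator.compare(value, node.value) < 0) {
            node.left = insert(node.left, value);
        } else {
            node.right = insert(node.right, value); // equal elements go right, so duplicates are kept
        }
        node.size++;
        return node;
    }

    private static int sizeOf(Node<?> node) { return node == null ? 0 : node.size; }
}
```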
The main difference between SortedSet and List is:
You seem to want a kind of fusion of both: automatic sorting and (reasonably fast) index access. Depending on the size of the data and on how often indexed reads or additions of new elements occur, these are my ideas:
In any case, the interfaces and contracts of SortedSet and List are not really compatible, so you'll want the List part to be read-only (or read-and-delete-only), not allowing set and add, and to have an extra object (maybe implementing the Collection interface) for adding objects.
I would use a Guava TreeMultiset, assuming you want a List because you may have duplicate elements. It'll do everything you want. The one thing it won't have is index-based access, which doesn't make much sense anyway given that you aren't putting elements at indices of your choosing. The other thing to be aware of is that it won't actually store duplicates of equal objects, just a count of the total number of them.
commons-collections has TreeBag.
Initially I suggested PriorityQueue, but its iteration order is undefined, so it's of no use unless you iterate it by repeatedly taking the head of a clone of the queue until it is empty.
Since you are most likely concerned with the iteration order, I believe you can override the iterator() method so that the elements are sorted before they are iterated.
You can improve this by storing a snapshot of the sorted collection and using modCount to verify whether the collection has changed since the snapshot was taken.
Depending on the use case, this may be less or more efficient than Peter's suggestion. For example, if you add multiple items and then iterate (without adding items between iterations), this might be more efficient.
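A rough sketch of the snapshot variant, since the original code is not preserved here. The class name SortedViewList and its fields are hypothetical. It relies on the protected modCount field inherited from AbstractList, which changes on structural modifications such as add and remove.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical class name; elements stay in insertion order and are sorted only for iteration.
class SortedViewList<E> extends ArrayList<E> {
    private final Comparator<? super E> comparator;
    private List<E> snapshot;      // cached sorted copy, rebuilt lazily
    private int snapshotModCount;  // value of modCount when the snapshot was taken

    SortedViewList(Comparator<? super E> comparator) {
        this.comparator = comparator;
    }

    @Override
    public Iterator<E> iterator() {
        // Rebuild the sorted copy only if the list changed structurally since the last one.
        if (snapshot == null || snapshotModCount != modCount) {
            List<E> copy = new ArrayList<>(this); // copies via toArray(), not via iterator()
            copy.sort(comparator);
            snapshot = copy;
            snapshotModCount = modCount;
        }
        // Read-only iterator, so callers cannot remove through the snapshot.
        return Collections.unmodifiableList(snapshot).iterator();
    }
}
```

Several adds followed by several iterations cost a single sort. Note that set(int, E) does not change modCount, so this sketch would serve a stale snapshot after a replacement, and get(int) still returns elements in insertion order.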
The obvious solution is to create your own class that implements the java.util.List interface and takes a Comparator as an argument to the constructor. You would use the comparator in the right spots, i.e. the add method would iterate through the existing items and insert the new item at the right spot. You would disallow calls to methods like add(int index, Object obj) and so on.
In fact, someone must have created this already... a quick Google search reveals at least one example:
http://www.ltg.ed.ac.uk/NITE/nxt/apidoc/net/sourceforge/nite/util/SortedList.html
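As a rough illustration of such a class (unrelated to the NITE SortedList linked above), the sketch below builds on AbstractList, whose add(int, E) and set(int, E) already throw UnsupportedOperationException unless overridden. The class name ScanInsertList is hypothetical.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical class; a List whose add(E) places elements according to a Comparator.
class ScanInsertList<E> extends AbstractList<E> {
    private final List<E> items = new ArrayList<>();
    private final Comparator<? super E> comparator;

    ScanInsertList(Comparator<? super E> comparator) {
        this.comparator = comparator;
    }

    @Override
    public boolean add(E element) {
        int index = 0;
        // Walk past every element that sorts before or equal to the new one.
        while (index < items.size() && comparator.compare(items.get(index), element) <= 0) {
            index++;
        }
        items.add(index, element);
        modCount++; // lets iterators created earlier fail fast
        return true;
    }

    @Override
    public E get(int index) { return items.get(index); }

    @Override
    public E remove(int index) {
        modCount++;
        return items.remove(index);
    }

    @Override
    public int size() { return items.size(); }

    // add(int, E) and set(int, E) are inherited from AbstractList and throw
    // UnsupportedOperationException, so callers cannot break the ordering.
}
```

Because the scan passes over equal elements, duplicates end up in insertion order. The linear scan is O(n) per add; a binary search could replace it on the backing ArrayList.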
The only way to have a sorted structure with less than O(n) time to add/indexOf/remove/get an element is to use a tree. In that case the operations are generally O(log2 n), and traversal is roughly O(1) per step.
O(n) is just a linked list.
Edit: inserting into a linked list with binary search. For insert operations, when not using a binary structure and not dealing with small sizes, that should be optimal.
@Peter:
There is an algorithm that needs O(log2 n) compares (which are slow) to insert, and O(n) moves.
If you need to override LinkedList, so be it. But that's as neat as it can get. I kept the algorithm as clean as possible so that it is easy to understand; it can be optimized a little.
Consider indexed-tree-map, which I created while facing a similar problem. You will be able to access elements by index and get the index of an element while keeping the sort order. Duplicates can be put into arrays as values under the same key.
In the JavaFX TransformationList hierarchy there is something called a SortedList. The list is entirely observable, so additions and removals notify any other listeners watching the list.
The basic approach is to watch another ObservableList for changes and strategically use Collections.binarySearch(), as others have suggested, to locate the index of the addition or removal in O(log n) time.
There is one problem that I have not seen mentioned here: the ability to track added items that compare as equal, i.e. T1.compareTo(T2) == 0. In this case the sorted list (I will post my own source code below) must have a wrapper element type, which I will call Element. This is similar to what the creators of JavaFX did with SortedList. The reason is entirely down to removal operations: it is impossible to locate the original element if there are compareTo duplicates. Normally, in a NavigableSet implementation like TreeSet, these duplicates would never enter the Set. A list is different.
I have a library of observable lists that can be chained together (very similar to Java Streams) and that fully propagate results downstream as the previous source in the chain updates.
Class Hierarchy
Interface
Abstract Base Class for all Binding Types (Sort, Distinct, Map, FlatMap, etc.)
Sort Binding Class
Wrapper Element Class
JUNIT VERIFICATION TEST
JUNIT BENCHMARK TEST
Wrapper Class for Tests Only
I also find it mind-boggling that this does not exist in the Java standard libraries. (But good luck with proposing the addition of any new class to the JDK team! I have never had luck with that.)
Assuming that your compareTo function is a proper transitive relation, the fastest approach (assuming the list is read approximately as many times as it is written) is to override List.add with a method that performs a binary search to find the insertion point of the new item before inserting it. The search takes O(log(N)) comparisons in the number of elements already added; on an array-backed list the insertion itself still shifts up to O(N) elements.
The best way to do this would be to override the add implementation of a list.
I'm going to use a LinkedList to demonstrate it, as it allows efficient insertion once the position is known.
The result is a list of integers that is always sorted. It can easily be modified to work with any other data type. However, you will have to avoid using the add(index, value) function, as that would obviously break the sorting.
Although people above suggested re-sorting with Collections.sort(), I would avoid that, as it can be a significantly less efficient approach, especially since the sort method must be called with every addition to the list.
The contract of the ListIterator interface makes it a bit cumbersome, but a method built on it can perform the insertion with a single scan of the list (up to the insertion point).
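The method itself did not survive on this page. One way such a single-scan insertion could look is sketched below; the class and method names (SortedInsert, insertSorted) are hypothetical, and the list is assumed to be sorted already.

```java
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class SortedInsert {
    // Inserts element into an already sorted list, scanning only up to the insertion point.
    static <E> void insertSorted(List<E> sortedList, E element, Comparator<? super E> comparator) {
        ListIterator<E> it = sortedList.listIterator();
        while (it.hasNext()) {
            if (comparator.compare(it.next(), element) > 0) {
                it.previous(); // step back so the new element lands before the larger one
                break;
            }
        }
        it.add(element); // inserts at the cursor; at the end if no larger element was found
    }

    public static void main(String[] args) {
        List<Integer> numbers = new LinkedList<>();
        for (int value : new int[] {4, 1, 3, 1, 5}) {
            insertSorted(numbers, Integer.valueOf(value), Comparator.<Integer>naturalOrder());
        }
        System.out.println(numbers);
    }
}
```

The previous() call is the cumbersome part: next() has already moved the cursor past the first larger element, and ListIterator.add inserts at the cursor. On a LinkedList the insertion is O(1) once the cursor is in place, so the whole operation is one O(n) scan with no element shifting.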
SortedSet
Any implementation of the SortedSet interface carries your desired behavior.
By default, objects added are sorted in their natural order, that is, based on their implementation of the Comparable::compareTo interface method.
Alternatively, you can pass a Comparator implementation to determine the sort.
TreeSet
The TreeSet is a commonly used implementation of SortedSet. You can also find others.
Duplicates
The major difference between a List and a SortedSet is duplicates, objects that compare as equal. A List allows duplicates whereas a SortedSet, like any Set, does not.
Access by index
Another difference is that a Set cannot be accessed by index. You cannot locate an object by its position number within the collection.
If you need such access after constructing your SortedSet, make a List. There are multiple ways to do this, such as passing the SortedSet to the constructor of ArrayList. A more recent way, as of Java 10, is to make an unmodifiable List by passing the SortedSet to List.copyOf.
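A short sketch of the SortedSet approach with an explicit Comparator, followed by the List.copyOf step for index access. The class and method names (SortedSetDemo, sortedSnapshot) are made up for this illustration, and String.CASE_INSENSITIVE_ORDER is just one convenient comparator.

```java
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class SortedSetDemo {
    // Builds a set ordered by the given comparator, then takes an indexable snapshot of it.
    static List<String> sortedSnapshot(List<String> input, Comparator<String> comparator) {
        SortedSet<String> set = new TreeSet<>(comparator);
        set.addAll(input);       // elements the comparator treats as equal are dropped
        return List.copyOf(set); // unmodifiable List in the set's iteration order (Java 10+)
    }

    public static void main(String[] args) {
        List<String> names = sortedSnapshot(
                List.of("pear", "Apple", "fig", "apple"), String.CASE_INSENSITIVE_ORDER);
        System.out.println(names);        // "apple" is dropped as a duplicate of "Apple"
        System.out.println(names.get(0)); // index access works on the snapshot
    }
}
```

The snapshot does not track later changes to the set, so it has to be rebuilt after further additions.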
As stated clearly earlier, the reason to want a sorted list rather than a SortedSet is the need for indexing and duplicates. I needed both.
As of Java 8, java.util.List<E> has a "conscientious" List<E>.sort() method.
My list is pre-loaded before it is referenced, and the load order is almost always the sorted order; references need to be very fast. Therefore, the only change to user177800's answer is for me to defer to the now-provided sort.
I don't know in advance how many items will be held, so LinkedList<E>.
Aaaand now I notice that Collections.sort(List<T> list, Comparator<? super T> c) now defers to List.sort(c).
I believe a Priority Queue will do the job.
Caveat (from the same doc page):