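Rule of thumb for choosing an implementation of a Java Collection?
Does anyone have a good rule of thumb for choosing between the different implementations of Java Collection interfaces such as List, Map, or Set?
For example, generally why, or in what cases, would I prefer to use a Vector or an ArrayList, a Hashtable or a HashMap?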
Comments (11)
As suggested in other answers, the right collection depends on the use case. Here are a few points:
ArrayList:
LinkedList:
HashSet:
Making other yes-no decisions about an item, e.g. "is the item a word of English?", "is the item in the database?", "is the item in this category?", etc.
Remembering "which items you've already processed", e.g. when doing a web crawl.
HashMap:
For a given user ID, what is their cached name/User object?
Vector and Hashtable are synchronized and therefore a bit slower. If synchronization is needed, use Collections.synchronizedCollection().
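The HashSet and HashMap use cases above can be sketched roughly as follows. This is only an illustration: the class name CrawlExample, the method names, and the sample URLs and user IDs are all made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CrawlExample {
    // Returns each URL once, in first-seen order, skipping ones already processed.
    static List<String> firstVisits(List<String> urls) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String url : urls) {
            // Set.add() returns false when the element was already present.
            if (seen.add(url)) {
                result.add(url);
            }
        }
        return result;
    }

    // Hypothetical cache lookup: user ID -> cached display name.
    static String cachedName(Map<Integer, String> cache, int userId) {
        return cache.getOrDefault(userId, "<unknown>");
    }

    public static void main(String[] args) {
        System.out.println(firstVisits(List.of("a.example", "b.example", "a.example")));

        Map<Integer, String> cache = new HashMap<>();
        cache.put(42, "alice");
        System.out.println(cachedName(cache, 42));
        System.out.println(cachedName(cache, 7));
    }
}
```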
Check this for sorted collections.
Hope this helps.
Lists allow duplicate items, while Sets allow only one instance.
I use a Map whenever I need to perform a lookup.
For the specific implementations, there are order-preserving variations of Maps and Sets, but largely it comes down to speed. I tend to use ArrayList for reasonably small Lists and HashSet for reasonably small Sets, but there are many implementations (including any that you write yourself). HashMap is pretty common for Maps. For anything more than 'reasonably small' you have to start worrying about memory, so the choice will depend much more on the specific algorithm.
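As a rough illustration of the order-preserving variations mentioned above, here is a small sketch using LinkedHashSet (insertion order) and TreeSet (sorted order). The class and method names are invented for the example; a plain HashSet makes no ordering promise, so it is left out of the comparison.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class OrderedSets {
    // Removes duplicates while keeping the order in which items first appeared.
    static List<String> insertionOrder(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    // Removes duplicates and returns the items in natural (sorted) order.
    static List<String> sortedOrder(List<String> items) {
        return new ArrayList<>(new TreeSet<>(items));
    }

    public static void main(String[] args) {
        List<String> items = List.of("pear", "apple", "pear", "fig");
        System.out.println(insertionOrder(items)); // [pear, apple, fig]
        System.out.println(sortedOrder(items));    // [apple, fig, pear]
    }
}
```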
This page has lots of animated images along with sample code testing LinkedList vs. ArrayList if you're interested in hard numbers.
EDIT: I hope the following links demonstrate how these things are really just items in a toolbox; you just have to think about what your needs are. See the Commons-Collections versions of Map, List and Set.
I'll assume you know the difference between a List, Set and Map from the above answers. Why you would choose between their implementing classes is another thing. For example:
List:
Set:
Map: The performance and behavior of HashMap and TreeMap are parallel to the Set implementations.
Vector and Hashtable should not be used. They are synchronized implementations that predate the new Collection hierarchy, and the synchronization makes them slow. If synchronization is needed, use Collections.synchronizedCollection().
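A minimal sketch of the synchronized-wrapper approach, using Collections.synchronizedList (a sibling of Collections.synchronizedCollection in the same class). The class name, method name, and thread counts are assumptions made for the example. Note that the wrapper only makes individual calls thread-safe; iterating over the list still requires synchronizing on it manually.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedWrapperDemo {
    // Starts several threads that all append to one wrapped ArrayList,
    // then returns the final size once every thread has finished.
    static int concurrentAdds(int threads, int addsPerThread) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    shared.add(i); // each add() is synchronized by the wrapper
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 1000)); // 4000
    }
}
```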
I really like this cheat sheet from Sergiy Kovalchuk's blog entry, but unfortunately it is offline. However, the Wayback Machine has a historical copy:
More detailed was Alexander Zagniotov's flowchart, which is also offline, so here too is a historical copy of the blog:
Excerpt from the blog on concerns raised in comments:
"This cheat sheet doesn't include rarely used classes like WeakHashMap, LinkedList, etc. because they are designed for very specific or exotic tasks and shouldn't be chosen in 99% cases."
Well, it depends on what you need. The general guidelines are:
List is a collection where data is kept in order of insertion and each element has an index.
Set is a bag of elements without duplication (if you reinsert the same element, it won't be added). Data doesn't have the notion of order.
Map: you access and write your data elements by their key, which can be any object.
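The three guidelines above can be seen side by side in a short sketch. The class and method names are made up for illustration, and ArrayList, HashSet, and HashMap are simply the most common implementations, chosen here as an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class InterfaceBasics {
    // List: keeps every element, duplicates included, each at an index.
    static List<String> keepAll(List<String> words) {
        return new ArrayList<>(words);
    }

    // Set: a repeated element is stored only once.
    static Set<String> distinct(List<String> words) {
        return new HashSet<>(words);
    }

    // Map: values are written and read by key (here: word -> its length).
    static Map<String, Integer> lengthsByWord(List<String> words) {
        Map<String, Integer> lengths = new HashMap<>();
        for (String word : words) {
            lengths.put(word, word.length());
        }
        return lengths;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "bb", "a");
        System.out.println(keepAll(words).size());        // 3
        System.out.println(distinct(words).size());       // 2
        System.out.println(lengthsByWord(words).get("bb")); // 2
    }
}
```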
Attribution: https://stackoverflow.com/a/21974362/2811258
For more information about Java Collections, check out this article.
For non-sorted collections the best choice, more than nine times out of ten, will be: ArrayList, HashMap, HashSet.
Vector and Hashtable are synchronised and therefore might be a bit slower. It's rare that you would want synchronised implementations, and when you do, their interfaces are not sufficiently rich for their synchronisation to be useful. In the case of Map, ConcurrentMap adds extra operations to make the interface useful. ConcurrentHashMap is a good implementation of ConcurrentMap.
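To illustrate what "extra operations" buys you, here is a small sketch of a hit counter built on ConcurrentHashMap. The HitCounter class and its methods are hypothetical; putIfAbsent and merge are real ConcurrentMap methods that perform a check-then-update as one atomic step, which a plain synchronized get followed by put cannot guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class HitCounter {
    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Returns true only for the first caller to register this page.
    boolean registerOnce(String page) {
        return hits.putIfAbsent(page, 0) == null;
    }

    // Atomically increments the counter for the page.
    void record(String page) {
        hits.merge(page, 1, Integer::sum);
    }

    int count(String page) {
        return hits.getOrDefault(page, 0);
    }

    // Hammers one key from several threads and returns the final count.
    static int countAfterConcurrentHits(int threads, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    counter.record("/home");
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.count("/home");
    }

    public static void main(String[] args) {
        System.out.println(countAfterConcurrentHits(4, 1000)); // 4000
    }
}
```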
LinkedList is almost never a good idea. Even if you are doing a lot of insertions and removals, if you are using an index to indicate position then that requires iterating through the list to find the correct node. ArrayList is almost always faster.
For Map and Set, the hash variants will be faster than tree/sorted. Hash algorithms tend to have O(1) performance, whereas trees will be O(log n).
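For context on when the tree variants are still worth their O(log n) cost: they keep keys sorted, which allows "nearest key" queries that a hash-based map cannot answer without scanning everything. A minimal sketch, with a made-up ScoreBuckets class and made-up thresholds:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class ScoreBuckets {
    // Hypothetical thresholds: the label applies from that score upward.
    static NavigableMap<Integer, String> defaultThresholds() {
        NavigableMap<Integer, String> thresholds = new TreeMap<>();
        thresholds.put(0, "low");
        thresholds.put(50, "mid");
        thresholds.put(80, "high");
        return thresholds;
    }

    // Finds the label of the greatest threshold that is <= score.
    static String bucketFor(NavigableMap<Integer, String> thresholds, int score) {
        Map.Entry<Integer, String> entry = thresholds.floorEntry(score);
        return entry == null ? "none" : entry.getValue();
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> thresholds = defaultThresholds();
        System.out.println(bucketFor(thresholds, 65)); // mid
        System.out.println(bucketFor(thresholds, 80)); // high
        System.out.println(bucketFor(thresholds, -5)); // none
    }
}
```

If no ordering or range query is needed, the advice above stands and HashMap or HashSet is the simpler choice.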
About your first question...
List, Map and Set serve different purposes. I suggest reading about the Java Collections Framework at http://java.sun.com/docs/books/tutorial/collections/interfaces/index.html.
To be a bit more concrete:
About your second question...
The main difference between Vector and ArrayList is that the former is synchronized and the latter is not. You can read more about synchronization in Java Concurrency in Practice.
The difference between Hashtable (note that the T is not a capital letter) and HashMap is similar: the former is synchronized, the latter is not.
I would say that there is no rule of thumb for preferring one implementation over another; it really depends on your needs.
Theoretically there are useful Big-Oh tradeoffs, but in practice these almost never matter.
In real-world benchmarks, ArrayList out-performs LinkedList even with big lists and with operations like "lots of insertions near the front." Academics ignore the fact that real algorithms have constant factors that can overwhelm the asymptotic curve. For example, linked-lists require an additional object allocation for every node, meaning slower to create a node and vastly worse memory-access characteristics.
My rule is:
I've always made those decisions on a case-by-case basis, depending on the use case, such as:
And then I break out my handy 5th edition Java in a Nutshell and compare the ~20 or so options. It has nice little tables in Chapter five to help one figure out what is appropriate.
Ok, maybe if I know off the cuff that a simple ArrayList or HashSet will do the trick I won't look it all up. ;) But if there is anything remotely complex about my intended use, you bet I'm in the book. BTW, I thought Vector is supposed to be 'old hat'. I've not used one in years.
Use Map for key-value pairing
For key-value tracking, use a Map implementation.
For example, tracking which person is covering which day of the weekend. So we want to map a DayOfWeek object to an Employee object.
When choosing one of the Map implementations, there are several aspects to consider. These include: concurrency, tolerance for NULL values in key and/or value, order when iterating keys, tracking by reference versus content, and convenience of literals syntax.
Here is a chart I made showing the various aspects of each of the ten Map implementations bundled with Java 11.
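The DayOfWeek-to-Employee mapping described above could be sketched like this. EnumMap is one reasonable choice when the keys are enum values, but that choice is an assumption here, not something the answer prescribes; the Employee class, the WeekendCoverage class, and the names used are all hypothetical. EnumMap iterates its keys in the enum's declared order and rejects null keys.

```java
import java.time.DayOfWeek;
import java.util.EnumMap;
import java.util.Map;

class WeekendCoverage {
    // Hypothetical Employee type, invented for this example.
    static final class Employee {
        final String name;

        Employee(String name) {
            this.name = name;
        }
    }

    // Builds the roster; insertion order does not affect iteration order.
    static Map<DayOfWeek, Employee> weekendRoster(Employee saturday, Employee sunday) {
        Map<DayOfWeek, Employee> roster = new EnumMap<>(DayOfWeek.class);
        roster.put(DayOfWeek.SUNDAY, sunday);
        roster.put(DayOfWeek.SATURDAY, saturday);
        return roster;
    }

    public static void main(String[] args) {
        Map<DayOfWeek, Employee> roster =
                weekendRoster(new Employee("Alice"), new Employee("Bob"));
        // Iterates SATURDAY first, then SUNDAY (DayOfWeek declaration order).
        for (Map.Entry<DayOfWeek, Employee> entry : roster.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().name);
        }
    }
}
```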
I found Bruce Eckel's Thinking in Java to be very helpful. He compares the different collections very well. I used to keep a diagram he published showing the inheritance hierarchy on my cube wall as a quick reference. One thing I suggest you do is keep thread safety in mind. The best-performing implementations are usually not thread-safe.