Theoretical limit on the number of keys (objects) that can be stored in a HashMap?
Is there a theoretical limit for the number of key entries that can be stored in a HashMap or does the maximum purely depend on the heap memory available?
Also, which data structure is the best to store a very large number of objects (say several hundred thousand objects)?
Answers (4)
Looking at the documentation of that class, I would say that the theoretical limit is `Integer.MAX_VALUE` (2^31 − 1 = 2147483647) elements. This is because, to properly implement this class, the `size()` method is obliged to return an `int` representing the number of key/value pairs. From the documentation of `HashMap.size()`.

Note: This question is very similar to "How much data a list can hold at the maximum".

I would say it depends on what you need to store and what type of access you require. All built-in collections are probably well optimized for large quantities.
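The point about `size()` can be seen directly in the `Map` interface: the method is declared to return an `int`, so the reported count is bounded by `Integer.MAX_VALUE`. A minimal sketch (class name is my own, for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class SizeLimitDemo {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "a");

        // Map.size() is declared as `int size()`, so the value it reports
        // can never exceed Integer.MAX_VALUE (2^31 - 1).
        int size = map.size();
        System.out.println(size);              // 1
        System.out.println(Integer.MAX_VALUE); // 2147483647
    }
}
```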
`HashMap` holds the values in an array, which can hold up to `Integer.MAX_VALUE` elements. But this does not count collisions. Each `Entry` has a `next` field, which is also an entry. This is how collisions (two or more objects with the same hashcode) are resolved. So I wouldn't say there is any limit (apart from the available memory).

Note that if you exceed `Integer.MAX_VALUE`, you'll get unexpected behaviour from some methods, like `size()`, but `get()` and `put()` will still work. And they will work because the `hashCode()` of any object returns an `int`, so by definition every object fits in the map; each new object will then simply collide with an existing one.
There is no theoretical limit, but there is a limit on the number of buckets available to store the different entry chains (entries stored under different hash keys). Once you reach this limit, every new addition results in a hash collision -- but this is not a problem, except for performance...
I agree with @Bozho and will also add that you should read the Javadoc of HashMap carefully. Note how it discusses the initial capacity and load factor and how they affect the performance of the HashMap.

HashMap is perfectly fine for holding large sets of data (as long as you don't run out of keys or memory), but performance can be an issue.

You may need to look into distributed caches/data grids if you find you can't manipulate the datasets you need in a single Java/JVM program.
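The initial-capacity and load-factor advice can be sketched as follows. With the default load factor of 0.75, a `HashMap` resizes (and rehashes everything) once the entry count exceeds capacity × 0.75, so pre-sizing for an expected count avoids repeated rehashing. The sizing formula below is a common idiom, not an official API:

```java
import java.util.HashMap;
import java.util.Map;

public class CapacityDemo {
    public static void main(String[] args) {
        int expected = 500_000;

        // Request enough capacity up front so that expected entries fit
        // without triggering a resize: expected / loadFactor, rounded up.
        Map<Integer, Integer> map = new HashMap<>((int) (expected / 0.75f) + 1);

        for (int i = 0; i < expected; i++) {
            map.put(i, i * 2);
        }
        System.out.println(map.size()); // 500000
    }
}
```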