Java 中 HashTable 的自定义实现?
我正在解决 Quora 问题,对于我的特定解决方案,我需要一个哈希表(长键、整数值) )用于缓存值。我希望 Java HashMap 能够得到改进,因为我知道键和值的数据类型,它们是原语,也是我的问题空间。我决定天真地继续使用“链表数组”结构实现一个简单的哈希表(甚至我的 linkedList 是我自己实现的 Node 类)。但我注意到我自己的简单实现比通用 Java HashMap 慢大约 4 倍。我还尝试使用 Trove 的 LongToIntMap 库来查看它们的作用。有没有人有任何好的建议来在 Java 中构建一个自定义的 Long 到 Int 哈希表,其性能显着优于 Java HashMap?
I was solving the Quora problem and for my particular solution I needed a hashtable (long-keys, int-values) for caching values. I hoped that the Java HashMap could be improved because I knew my data types for the keys and values and they were primitives and also my problem space. I decided to naively go ahead implement a simple hashtable using the "array-of-linkedlist" structure (even my linkedList was my own Node class I had implemented). But I noticed that my own naive implementation was about 4x slower than the generic Java HashMap. I also tried to use Trove's LongToIntMap library to see what they do. Does anyone have any good suggestions to build a custom Long to Int hashtable in Java that significantly outperforms the Java HashMap?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
看看 Javolution 的 FastMap。源代码可在此处获取。
Have a look at Javolution's FastMap. Source code is available here.
您是否尝试查看代码以了解他们是如何做到的?
如果不查看代码,就不可能确定您在实现中做错了什么。然而,一种可能的改进可能是将
LinkedList
替换为使用int[]
表示列表的自定义“整数列表”类型。根据您的哈希表 API,您应该能够避免将值表示为对象(特别是Integer
)的成本。 (作为推论,通过不实现具有键和/或值类型的泛型类型的 API,您将获得更好的性能和空间利用率。)无论如何,可能导致性能不佳的一个错误是忽略实施哈希表调整大小。如果不调整大小,表上的
get
和put
操作的复杂度将为O(N)
而不是O(1)< /code> ...因为哈希链长度将与哈希表条目的数量成比例增长。
最后,您需要清楚自己是针对性能还是针对空间利用率进行优化。最佳解决方案会有所不同......
Did you try looking at the code to see how they do it?
It is not possible to say say for sure what you did wrong in your implementation without looking at the code. However one possible improvement might be to replace the
LinkedList<Integer>
with a custom "list of integer" type that usesint[]
to represent the lists. Depending on your hash table API, you should be able to avoid the cost of representing your values as objects (specificallyInteger
s). (And as a corollary, you will get better performance and space utilization by NOT implementing an API that has generic types for the key and/or value types.)For what it is worth, one mistake that could lead to poor performance is neglecting to implement hashtable resizing. Without resizing, the complexity of
get
andput
operations on the table will beO(N)
rather thanO(1)
... because the hash chain lengths will grow in proportion to the number of hash table entries.Finally, you need to be clear in your mind whether you are optimizing for performance or space utilization. The optimal solution will be different ....