为什么哈希映射比 trie 映射更好?
我所说的 trie 映射是指关联数组,其中有效负载存储在 trie 中,而不是哈希表中。
当我使用哈希映射/表时,我使用的键通常是字符串。与某些基于 trie 的映射相比,哈希映射有哪些优点?我读过哈希映射更快 - 但在我看来,一致的哈希函数必须检查 (char) 数组的每个元素以获得最终哈希 - 迭代数组一次。在 trie 中,您同样只需迭代数组一次。
在我看来,在编码小对象时,这确实会使用更多的内存(即使你只允许在键中使用小写字母字符,每个节点有 26 个指针,并且每个键通常有多个节点),但从好的方面来说,你永远不必担心调整大小。为什么哈希映射如此常见,但我从未见过特里映射?
By trie map I mean an associative array, where the payloads are stored in a trie instead of a hash table.
When I'm using a hash map/table, the keys I use are typically strings. What are the advantages of a hash map over some trie based map? I have read that a hash map is faster - but it seems to me that a consistent hash functions would have to check each element of the (char) array for the final hash - iterating over the array once. In a trie you would similarly have to iterate over the array just once.
It does seem to me that this would use a lot more memory when encoding small objects (even if you only allow lower case alpha characters in the keys, it's 26 pointers per node and often multiple nodes per key), but on the plus side you never have to worry about resizing. Why is it that hash maps are so common, but I've never seen a trie map?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果您是 Scala 程序员,TrieMap 是“哈希数组映射 trie 的并发线程安全无锁实现”。目前 Java 标准库中还没有。
Just in case you are a Scala programmer, TrieMap is a "concurrent thread-safe lock-free implementation of a hash array mapped trie". There's none in Java standard lib this moment.
哈希映射比 trie 映射更常见,因为它们更通用:它们可以用于任何可哈希的对象,而 trie 则适用于序列。哈希表在常见实现中还具有更好的引用局部性,因为它们将元素存储在一起。
(严格来说,每个对象都是一个位序列,但是通用 trie 会要求用户在将对象存储在 trie 中之前对其进行序列化。与定义自定义哈希函数相比,这非常不方便。)
Hash maps are more common than trie maps because they are more generic: they can be made to work on any object that is hashable, while a trie works on sequences. Hash tables also have better locality of reference in common implementations because they store elements close together.
(Strictly speaking, every object is a sequence of bits, but then a generic trie would require the user to serialize their object before storing it in the trie. That's pretty inconvenient compared to defining custom hash functions.)