创建一个包含两个数组的哈希表
有谁知道如何做到这一点以及伪代码是什么样子?
众所周知,哈希表存储键、值对,当调用某个键时,该函数将返回与该键关联的值。我想做的是了解创建映射函数的底层结构。例如,如果我们生活在一个除了数组之外没有以前定义的函数的世界,我们如何复制我们今天拥有的哈希图?
Does anyone know how to do this and what the pseudo code would look like?
As we all know a hash table stores key,value pairs and when a key is a called, the function will return the value associated with that key. What I want to do is understand the underlying structure in creating that mapping function. For example, if we lived in a world where there were no previously defined functions except for arrays, how could we replicate the Hashmaps that we have today?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
实际上,今天的一些 Hashmap 实现确实是由数组组成的,正如您所建议的那样。让我概述一下它是如何工作的:
哈希函数
哈希函数将您的键转换为第一个数组(数组 K)的索引。为此,可以使用 MD5 等哈希函数或更简单的哈希函数(通常包括模运算符)。
桶
一个简单的基于数组的 Hashmap 实现可以使用存储桶来应对冲突。数组 K 中的每个元素(“桶”)本身包含一个对的数组(数组 P)。添加或查询元素时,哈希函数会将您指向 K 中的正确存储桶,其中包含所需的数组 P。然后您迭代 P 中的元素,直到找到匹配的键,或者在P 结尾。
使用哈希将键映射到存储桶
您应该确保桶的数量(即 K 的大小)是 2 的幂,比方说 2^b。要找到某个键的正确存储桶索引,请计算 Hash(key) 但仅保留前 b 位。这是转换为整数时的索引。
重新缩放
计算密钥的哈希值并找到正确的存储桶非常快。但是,一旦桶变得更满,您将不得不迭代越来越多的项目才能找到正确的项目。因此,拥有足够的存储桶来正确分配对象非常重要,否则你的 Hashmap 会变得很慢。
因为您通常不知道要提前在 Hashmap 中存储多少对象,所以需要动态增大或缩小映射。您可以对存储的对象数量进行计数,一旦超过某个阈值,您就重新创建整个结构,但这次数组 K 的大小更大或更小。这样,K 中的一些存储桶就被保留了。 very full 现在会将它们的元素分配到几个桶中,这样性能会更好。
替代方案
您还可以使用二维数组来代替数组的数组,或者可以将数组 P 替换为链表。此外,您可以选择在其中一个存储桶包含超过某个配置数量的项目时重新创建(即重新缩放)哈希图,而不是保留存储对象的总数。
您所要求的内容的变体在哈希表维基百科条目中被描述为“数组哈希表”。
代码
有关代码示例,请查看此处。
希望这有帮助。
Actually, some of todays Hashmap implentations are indeed made out of arrays as you propose. Let me sketch how this works:
Hash Function
A hash function transforms your keys into an index for the first array (array K). A hash function such as MD5 or a simpler one, usually including a modulo operator, can be used for this.
Buckets
A simple array-based Hashmap implementation could use buckets to cope with collissions. Each element ('bucket') in array K contains itself an array (array P) of pairs. When adding or querying for an element, the hash function points you to the correct bucket in K, which contains your desired array P. You then iterate over the elements in P until you find a matching key, or you assign a new element at the end of P.
Mapping keys to buckets using the Hash
You should make sure that the number of buckets (i.e. the size of K) is a power of 2, let's say 2^b. To find the correct bucket index for some key, compute Hash(key) but only keep the first b bits. This is your index when cast to an integer.
Rescaling
Computing the hash of a key and finding the right bucket is very quick. But once a bucket becomes fuller, you will have to iterate more and more items before you get to the right one. So it is important to have enough buckets to properly distribute the objects, or your Hashmap will become slow.
Because you generally don't know how much objects you will want to store in the Hashmap in advance, it is desirable to dynamically grow or shrink the map. You can keep a count of the number of objects stored, and once it goes over a certain threshold you recreate the entire structure, but this time with a larger or smaller size for array K. In this way some of the buckets in K that were very full will now have their elements divided among several buckets, so that performance will be better.
Alternatives
You may also use a two-dimensional array instead of an array-of-arrays, or you may exchange array P for a linked list. Furthermore, instead of keeping a total count of stored objects, you may simply choose to recreate (i.e. rescale) the hashmap once one of the buckets contains more than some configured number of items.
A variation of what you are asking is described as 'array hash table' in the Hash table Wikipedia entry.
Code
For code samples, take a look here.
Hope this helps.
你能更精确一点吗?一个数组是否包含键,另一个数组包含值?
如果是这样,这里有一个 Java 示例(但这里有一些这种语言的特殊性):
当然,您必须实例化您的
map
对象(如果您使用 Java,我建议使用HashMap
而不是过时的HashTable
),并测试您的数组以避免null
对象并检查它们是否具有相同的尺寸。Could you be more precise? Does one array contain the keys, the other one the values?
If so, here is an example in Java (but there are few specificities of this language here):
Of course, you will have to instantiate your
map
object (if you are using Java, I suggest to use aHashMap<Object, Object>
instead of an obsoleteHashTable
), and also test your arrays in order to avoidnull
objects and check if they have the same size.示例说明:
在下面的源代码中,基本上它做了两件事:
1. 地图表示
示例:
注意:这是数组的数组,而不是两个数组(我看不到可能的通用哈希图,只有 2 个数组就很好)
如果您知道算法 >图论>邻接表,这看起来完全一样。
2. 哈希函数
哈希函数将字符串(输入)转换为数字(哈希值),该数字是数组的索引,
示例,
请参见第一个链接:
来源:
http://www.relisoft.com/book/lang/pointer/8hash。 html
哈希表如何工作?
更新< /strong>
这是最好的来源:http://algs4.cs.princeton.edu/34hash/
Sample Explanation:
At the below source, basically it does two things:
1. Map Representation
Example:
NOTE: this is array of arrays, not two arrays (I can't see a possible generic hashmap, in a good way with just 2 arrays)
If you know Algorithms > Graph theory > Adjacency list, this looks exactly same.
2. Hash function
And the hash function converts string (input) to a number (hash value), which is index of an array
Example,
See at the first link:
Source:
http://www.relisoft.com/book/lang/pointer/8hash.html
How does a hash table work?
Update
This is the Best source: http://algs4.cs.princeton.edu/34hash/
你的意思是这样吗?
下面以Ruby的
irb
为例:You mean like this?
The following is using Ruby's
irb
as an illustration: