当前位置：文江博客话题详情

创建一个包含两个数组的哈希表

发布于 2024-10-01 02:54:03 字数 142 浏览 6 评论 0原文

有谁知道如何做到这一点以及伪代码是什么样子？

众所周知，哈希表存储键、值对，当调用某个键时，该函数将返回与该键关联的值。我想做的是了解创建映射函数的底层结构。例如，如果我们生活在一个除了数组之外没有以前定义的函数的世界，我们如何复制我们今天拥有的哈希图？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

驱逐舰岛风号 2024-10-08 02:54:04

实际上，今天的一些 Hashmap 实现确实是由数组组成的，正如您所建议的那样。让我概述一下它是如何工作的：

哈希函数
哈希函数将您的键转换为第一个数组（数组 K）的索引。为此，可以使用 MD5 等哈希函数或更简单的哈希函数（通常包括模运算符）。

桶
一个简单的基于数组的 Hashmap 实现可以使用存储桶来应对冲突。数组 K 中的每个元素（“桶”）本身包含一个对的数组（数组 P）。添加或查询元素时，哈希函数会将您指向 K 中的正确存储桶，其中包含所需的数组 P。然后您迭代 P 中的元素，直到找到匹配的键，或者在P 结尾。

使用哈希将键映射到存储桶
您应该确保桶的数量（即 K 的大小）是 2 的幂，比方说 2^b。要找到某个键的正确存储桶索引，请计算 Hash(key) 但仅保留前 b 位。这是转换为整数时的索引。

重新缩放
计算密钥的哈希值并找到正确的存储桶非常快。但是，一旦桶变得更满，您将不得不迭代越来越多的项目才能找到正确的项目。因此，拥有足够的存储桶来正确分配对象非常重要，否则你的 Hashmap 会变得很慢。

因为您通常不知道要提前在 Hashmap 中存储多少对象，所以需要动态增大或缩小映射。您可以对存储的对象数量进行计数，一旦超过某个阈值，您就重新创建整个结构，但这次数组 K 的大小更大或更小。这样，K 中的一些存储桶就被保留了。 very full 现在会将它们的元素分配到几个桶中，这样性能会更好。

替代方案
您还可以使用二维数组来代替数组的数组，或者可以将数组 P 替换为链表。此外，您可以选择在其中一个存储桶包含超过某个配置数量的项目时重新创建（即重新缩放）哈希图，而不是保留存储对象的总数。

您所要求的内容的变体在哈希表维基百科条目中被描述为“数组哈希表”。

代码
有关代码示例，请查看此处。

希望这有帮助。

Actually, some of todays Hashmap implentations are indeed made out of arrays as you propose. Let me sketch how this works:

Hash Function
A hash function transforms your keys into an index for the first array (array K). A hash function such as MD5 or a simpler one, usually including a modulo operator, can be used for this.

Buckets
A simple array-based Hashmap implementation could use buckets to cope with collissions. Each element ('bucket') in array K contains itself an array (array P) of pairs. When adding or querying for an element, the hash function points you to the correct bucket in K, which contains your desired array P. You then iterate over the elements in P until you find a matching key, or you assign a new element at the end of P.

Mapping keys to buckets using the Hash
You should make sure that the number of buckets (i.e. the size of K) is a power of 2, let's say 2^b. To find the correct bucket index for some key, compute Hash(key) but only keep the first b bits. This is your index when cast to an integer.

Rescaling
Computing the hash of a key and finding the right bucket is very quick. But once a bucket becomes fuller, you will have to iterate more and more items before you get to the right one. So it is important to have enough buckets to properly distribute the objects, or your Hashmap will become slow.

Because you generally don't know how much objects you will want to store in the Hashmap in advance, it is desirable to dynamically grow or shrink the map. You can keep a count of the number of objects stored, and once it goes over a certain threshold you recreate the entire structure, but this time with a larger or smaller size for array K. In this way some of the buckets in K that were very full will now have their elements divided among several buckets, so that performance will be better.

Alternatives
You may also use a two-dimensional array instead of an array-of-arrays, or you may exchange array P for a linked list. Furthermore, instead of keeping a total count of stored objects, you may simply choose to recreate (i.e. rescale) the hashmap once one of the buckets contains more than some configured number of items.

A variation of what you are asking is described as 'array hash table' in the Hash table Wikipedia entry.

Code
For code samples, take a look here.

Hope this helps.

回复收藏 0 原文

习惯成性 2024-10-08 02:54:04

你能更精确一点吗？一个数组是否包含键，另一个数组包含值？

如果是这样，这里有一个 Java 示例（但这里有一些这种语言的特殊性）：

for (int i = 0; i < keysArray.length; i++) {
    map.put(keysArray[i], valuesArray[i]);
}

当然，您必须实例化您的 map 对象（如果您使用 Java，我建议使用HashMap 而不是过时的 HashTable），并测试您的数组以避免 null 对象并检查它们是否具有相同的尺寸。

Could you be more precise? Does one array contain the keys, the other one the values?

If so, here is an example in Java (but there are few specificities of this language here):

for (int i = 0; i < keysArray.length; i++) {
    map.put(keysArray[i], valuesArray[i]);
}

Of course, you will have to instantiate your map object (if you are using Java, I suggest to use a HashMap<Object, Object> instead of an obsolete HashTable), and also test your arrays in order to avoid null objects and check if they have the same size.

回复收藏 0 原文

一口甜 2024-10-08 02:54:04

示例说明：

在下面的源代码中，基本上它做了两件事：

1. 地图表示

一些（列表的 X 数量）列表
X 是 2 次方 N 数量的列表是不好的。 (2 的 N 次方)-1，或 (2 的 N 次方)+1，或质数都可以。

示例：

List myhashmap [hash_table_size];
// an array of (short) lists
// if its long lists, then there are more collisions

注意：这是数组的数组，而不是两个数组（我看不到可能的通用哈希图，只有 2 个数组就很好）

如果您知道算法 >图论>邻接表，这看起来完全一样。

2. 哈希函数

哈希函数将字符串（输入）转换为数字（哈希值），该数字是数组的索引，

将哈希值初始化为第一个字符（转换为 int 后），
对于每个后续字符，左移 4 位，然后添加 char （转换为 int 后）

示例，

int hash = input[0];
for (int i=1; i<input.length(); i++) {
    hash = (hash << 4) + input[i]
}

hash = hash % list.size()
// list.size() here represents 1st dimension of (list of lists)
//      that is 1st dimension size of our map representation from point #1
//      which is hash_table_size

请参见第一个链接：

int HTable::hash (char const * str) const

来源：
http://www.relisoft.com/book/lang/pointer/8hash。 html
哈希表如何工作？

更新< /strong>
这是最好的来源：http://algs4.cs.princeton.edu/34hash/

Sample Explanation:

At the below source, basically it does two things:

1. Map Representation

Some (X number of List) of lists
X being 2 power N number of lists is bad. A (2 power N)-1, or (2 power N)+1, or a prime number is good.

Example:

List myhashmap [hash_table_size];
// an array of (short) lists
// if its long lists, then there are more collisions

NOTE: this is array of arrays, not two arrays (I can't see a possible generic hashmap, in a good way with just 2 arrays)

If you know Algorithms > Graph theory > Adjacency list, this looks exactly same.

2. Hash function

And the hash function converts string (input) to a number (hash value), which is index of an array

initialize the hash value to first char (after converted to int)
for each further char, left shift 4 bits, then add char (after converted to int)

Example,

int hash = input[0];
for (int i=1; i<input.length(); i++) {
    hash = (hash << 4) + input[i]
}

hash = hash % list.size()
// list.size() here represents 1st dimension of (list of lists)
//      that is 1st dimension size of our map representation from point #1
//      which is hash_table_size

See at the first link:

int HTable::hash (char const * str) const

Source:
http://www.relisoft.com/book/lang/pointer/8hash.html
How does a hash table work?

Update
This is the Best source: http://algs4.cs.princeton.edu/34hash/

回复收藏 0 原文

我的鱼塘能养鲲 2024-10-08 02:54:04

你的意思是这样吗？

下面以Ruby的irb为例：

 cities = ["LA", "SF", "NY"]
 => ["LA", "SF", "NY"] 

 items = ["Big Mac", "Hot Fudge Sundae"]
 => ["Big Mac", "Hot Fudge Sundae"] 

 price = {}
 => {} 

 price[[cities[0], items[1]]] = 1.29
 => 1.29 

 price
 => {["LA", "Hot Fudge Sundae"]=>1.29} 

 price[[cities[0], items[0]]] = 2.49
 => 2.49 

 price[[cities[1], items[0]]] = 2.99
 => 2.99 

 price
 => {["LA", "Hot Fudge Sundae"]=>1.29, ["LA", "Big Mac"]=>2.49, ["SF", "Big Mac"]=>2.99} 

 price[["LA", "Big Mac"]]
 => 2.49

You mean like this?

The following is using Ruby's irb as an illustration:

 cities = ["LA", "SF", "NY"]
 => ["LA", "SF", "NY"] 

 items = ["Big Mac", "Hot Fudge Sundae"]
 => ["Big Mac", "Hot Fudge Sundae"] 

 price = {}
 => {} 

 price[[cities[0], items[1]]] = 1.29
 => 1.29 

 price
 => {["LA", "Hot Fudge Sundae"]=>1.29} 

 price[[cities[0], items[0]]] = 2.49
 => 2.49 

 price[[cities[1], items[0]]] = 2.99
 => 2.99 

 price
 => {["LA", "Hot Fudge Sundae"]=>1.29, ["LA", "Big Mac"]=>2.49, ["SF", "Big Mac"]=>2.99} 

 price[["LA", "Big Mac"]]
 => 2.49

回复收藏 0 原文

~没有更多了~