What algorithms exist for resizing a hash table?

Published 2024-08-20 02:33:01


I have implemented my own hash table functions in C, but currently they don't support resizing. I was wondering what algorithms exist, apart from the brute-force way of creating a new, empty hash table and moving everything there?


Comments (3)

花辞树 2024-08-27 02:33:01


There is incremental resizing.

From Wikipedia:

Incremental resizing

Some hash table implementations, notably in real-time systems, cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations. If one cannot avoid dynamic resizing, a solution is to perform the resizing gradually:

- During the resize, allocate the new hash table, but keep the old table unchanged.
- In each lookup or delete operation, check both tables.
- Perform insertion operations only in the new table.
- At each insertion also move r elements from the old table to the new table.
- When all elements are removed from the old table, deallocate it.

To ensure that the old table will be completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (r + 1)/r during the resizing.

So this is not some clever way of moving all of the elements from the old table into the new table (and if there is one, I haven't seen it); rather, it eases the burden of resizing by allowing the migration to happen gradually.

青芜 2024-08-27 02:33:01


Wikipedia has some words of wisdom on the subject.

Also, it's not a solution, but it could be part of one - if you're under Windows, you might use the VirtualAlloc family of functions, which allow you to reserve address space without actually committing memory pages. In layman's terms, you would do something like a "malloc" and tell it to "reserve 1000MB, but only make the first 10MB available". If you write past those 10MB, you get the usual crash. But when the time comes to expand, you just say "OK, give me another 10MB after the first ones", and the next 10MB is made available at the address directly after the first. It's like resizing an array in place: the actual RAM in use is only as much as you need, but the memory addresses are reserved in advance so that other memory allocation operations don't use them.

贪了杯 2024-08-27 02:33:01


The usual cop-out is to leave it up to the client code to guess the best number of buckets up front. That's serviceable; the client usually has a reasonable guess as to how many elements will end up in the table. If you want to do it automatically, you first have to declare an array of primes for the bucket sizes. When you see the table's load factor getting too high, pick the next prime in the array, recreate the bucket list, and move the elements from the old buckets into the new table.
