Why is Hashtable's load factor inconsistent with the one described in the CLRS book?
From the Java documentation for the Hashtable class:
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs
So the load factor for Hashtable is 0.75, which means that if there are N keys, the Hashtable will use M = N/0.75 slots to store them.
The CLRS book also introduces a load factor, alpha.
But as I understand it, CLRS intends alpha to be larger than 1, i.e., M = N/alpha < N. This means a hash table can use M slots, where M < N, so that it saves the storage that unused keys would otherwise occupy.
I say M < N can save storage because normally we don't know the exact value of N; we only know the set of possible keys and use N for the number of possible keys. The set of possible keys may be very big, but the number of keys actually in use is very small, so setting M smaller than N saves storage. This is also why a hash table normally does not use a direct-address array mapping every {key, value} pair 1:1.
But Hashtable in Java uses more than N slots. I think this is not consistent with the CLRS design, right?
Am I right?
Thanks
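To make the two conventions concrete, here is a minimal sketch of the arithmetic (the numbers 96 and alpha = 4 are hypothetical, chosen only for illustration): a load factor below 1, as in Java, forces more slots than keys, while a CLRS-style alpha above 1 allows fewer slots than keys.

```java
public class LoadFactorDemo {
    public static void main(String[] args) {
        int n = 96;               // hypothetical number of keys to store
        float loadFactor = 0.75f; // Java's default load factor

        // Java convention: M = N / loadFactor, so M > N when loadFactor < 1.
        int slotsJava = (int) Math.ceil(n / loadFactor);
        System.out.println(slotsJava); // 128: more slots than the 96 keys

        // CLRS chaining convention: alpha = N / M may exceed 1,
        // so M = N / alpha can be smaller than N (longer chains in exchange).
        double alpha = 4.0; // hypothetical alpha > 1
        int slotsClrs = (int) Math.ceil(n / alpha);
        System.out.println(slotsClrs); // 24: fewer slots than keys
    }
}
```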
Comments (2)
Well, the table should be larger than the number of elements added: dividing by a number less than one yields a result larger than the original number.
Assuming you want to add 100 elements, the required capacity works out to
100 / 0.75 = 133.333333 -> 133
slots, as described in the JavaDoc quoted in the question.
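The capacity calculation above can be sketched as follows (the original answer's exact snippets are not shown here, so this is an assumed reconstruction of the same arithmetic):

```java
public class CapacityDemo {
    public static void main(String[] args) {
        int elements = 100;       // elements to be added
        float loadFactor = 0.75f; // Java's default load factor

        // Capacity needed so that 100 entries stay below the resize threshold:
        // 100 / 0.75 = 133.333..., truncated to an int.
        int capacity = (int) (elements / loadFactor);
        System.out.println(capacity); // 133
    }
}
```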
I have not heard of the CLRS book, but I can tell you that using a load factor of more than about 0.75 (even less for some hash map designs) leads to a significant number of collisions.
If you allow the HashMap or Hashtable to grow naturally, its capacity will be proportional to the number of entries. The slot references are small (typically 4 bytes) compared to the size of the Entry objects (typically 16-24 bytes), so the hash index table you are concerned about will always be several times smaller than the Entry objects themselves, let alone the keys and values.
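The memory argument can be sketched with back-of-the-envelope numbers (the per-reference and per-Entry sizes are the answer's rough estimates, not measured values; 1,000 entries is a hypothetical workload):

```java
public class OverheadDemo {
    public static void main(String[] args) {
        int entries = 1_000;      // hypothetical number of entries
        float loadFactor = 0.75f;

        int tableSlots = (int) Math.ceil(entries / loadFactor); // 1334 slots
        int refBytes = 4;    // per-slot reference (answer's estimate)
        int entryBytes = 20; // per-Entry object, midpoint of 16-24

        int tableCost = tableSlots * refBytes; // cost of the index table
        int entryCost = entries * entryBytes;  // cost of the Entry objects

        System.out.println(tableCost);             // 5336
        System.out.println(entryCost);             // 20000
        System.out.println(entryCost / tableCost); // 3: entries dominate
    }
}
```

Even though the table has more slots than entries, the slot array is several times cheaper than the Entry objects it indexes, which is the answer's point.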