将整个哈希范围分割成相等的范围

发布于 2024-08-29 23:58:44 字数 254 浏览 8 评论 0原文

我正在寻找一个哈希范围(md5 或 sha1)并将其分成 n 个相等的范围。

例如,如果 m(节点数)= 5,则整个哈希范围将被分割为 5,以便键范围均匀分布。我希望 n=1 (节点 1)从哈希范围的开头到 1/5,2 从 1/5 到 2/5,等等一直到结束。

基本上,我需要将键范围映射到每个 n,这样当我散列一个值时,它就知道哪个 n 将处理该范围。

我对哈希很陌生,有点不确定我可以从哪里开始为项目解决这个问题。您能提供的任何帮助都会很棒。

I am looking to take a hash range (md5 or sha1) and split it into n equal ranges.

For example, if m (num nodes) = 5, the entire hash range would be split by 5 so that there would be a uniform distribution of key ranges. I would like n=1 (node 1) to be from the beginning of the hash range to 1/5, 2 from 1/5 to 2/5, etc all the way to the end.

Basically, I need to have key ranges mapped to each n such that when I hash a value, it knows which n is going to take care of that range.

I am new to hashing and a little bit unsure of where I could start on solving this for a project. Any help you could give would be great.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

江城子 2024-09-05 23:58:44

如果您希望将哈希值均匀地放入多个“桶”中,那么一些简单的数学运算就可以解决问题。注意舍入边缘情况...您最好使用 2 的幂作为 BUCKETS 值。

顺便说一下,这是Python代码,它支持大整数......

BUCKETS    = 5
BITS       = 160

BUCKETSIZE = 2**BITS / BUCKETS

int('ad01c5b3de58a02a42367e33f5bdb182d5e7e164', 16) / BUCKETSIZE == 3
int('553ae7da92f5505a92bbb8c9d47be76ab9f65bc2', 16) / BUCKETSIZE == 1
int('001c7c8c5ff152f1cc8ed30421e02a898cfcfb23', 16) / BUCKETSIZE == 0

If you are looking to place a hash value into a number of "buckets" evenly, then some simple math will do the trick. Watch out for rounding edge cases... You would be better to use a power of 2 for the BUCKETS value.

This is python code, by the way, which supports large integers...

BUCKETS    = 5
BITS       = 160

BUCKETSIZE = 2**BITS / BUCKETS

int('ad01c5b3de58a02a42367e33f5bdb182d5e7e164', 16) / BUCKETSIZE == 3
int('553ae7da92f5505a92bbb8c9d47be76ab9f65bc2', 16) / BUCKETSIZE == 1
int('001c7c8c5ff152f1cc8ed30421e02a898cfcfb23', 16) / BUCKETSIZE == 0
预谋 2024-09-05 23:58:44

如果你能忍受一点点很难消除偏差(任何2的幂都不可能被5整除,所以必须有一些偏差),然后取模(C和许多其他语言中的使用类似 C 的语法)是将整个范围划分为 5 个几乎相同大小的分区的方法。

任何带有 md5(m)%5==0 的消息 m 都位于第一个分区中,等等。

If you can stand a little very hard to remove bias (any power of two is impossible to divide evenly in 5, so there has to be some bias), then modulo (% in C and many other languages with C-like syntax) is the way to divide the full range into 5 almost identically-sized partitions.

Any message m with md5(m)%5==0 is in the first partition, etc.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文