Hash collision tests on ASCII characters
I'm currently building a caching system for some of our back-end systems, which means I'll need a hash table of some sort to represent the cached entities. In this context, I was wondering if anyone knows of any tests showing, for different hash algorithms, the minimum ASCII string length needed to provoke a collision? In other words, what's a safe length (in ASCII characters) to hash with a range of functions?
The reason, of course, is that I want the best trade-off between size (the cache is going to represent several million entities on relatively small servers), performance and collision safety.
Thanks in advance,
Nick
Comments (1)
If you want a strong hash, I'd suggest something like the Jenkins Hash. It should be less likely to generate collisions than a weaker hash. In terms of tests for algorithms, what you're looking for is an avalanche test, which checks how many output bits change when a single input bit is flipped.
Bob Jenkins' site has a whole lot of handy information on this sort of thing.
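For illustration, here's a rough Python sketch of that kind of test. It uses Jenkins' one-at-a-time hash, which is only one of the hashes he has published, so treat the choice as an assumption on my part. `avalanche_ratio` is a hypothetical helper name of mine, not something from his site. It flips one input bit at a time and measures what fraction of the 32 output bits change; a well-mixed hash should land somewhere close to 0.5.

```python
import random


def one_at_a_time(data: bytes) -> int:
    """Jenkins' one-at-a-time hash, truncated to 32 bits."""
    h = 0
    for b in data:
        h = (h + b) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    # Final mixing steps
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def avalanche_ratio(hash_fn, key_len=16, trials=200, seed=0):
    """Average fraction of the 32 output bits that change when one
    input bit is flipped (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    flipped = 0
    total = 0
    for _ in range(trials):
        # Random printable-ASCII key of key_len characters
        key = bytearray(rng.randrange(0x20, 0x7F) for _ in range(key_len))
        base = hash_fn(bytes(key))
        # Flip only the low 7 bits of each byte, so the mutated key stays
        # within 7-bit ASCII (it may contain control characters)
        for bit in range(key_len * 7):
            mutated = bytearray(key)
            mutated[bit // 7] ^= 1 << (bit % 7)
            diff = base ^ hash_fn(bytes(mutated))
            flipped += bin(diff).count("1")
            total += 32
    return flipped / total


if __name__ == "__main__":
    print(f"avalanche ratio: {avalanche_ratio(one_at_a_time):.3f}")
```

You can pass any function that takes bytes and returns a 32-bit integer as `hash_fn`, so it's easy to compare candidates on keys of the length you expect to cache. Note that this only measures the average; a more thorough test would look at each input bit / output bit pair separately.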
As for the size of the hash table, I believe Knuth recommends making it large enough that, with a perfect hash, two-thirds of the table would be full, while Jenkins recommends the nearest greater power of two.
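To make the two sizing rules concrete, here's a small Python sketch. The function names are mine, and I'm assuming "nearest greater power of two" means the smallest power of two that is at least the number of entries; if you read it as applying to the padded two-thirds size instead, just feed that number in.

```python
def slots_for_two_thirds_load(n_entries: int) -> int:
    """Smallest table size at which n_entries fill at most 2/3 of the slots,
    i.e. ceil(3 * n_entries / 2)."""
    return (3 * n_entries + 1) // 2


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (assumed reading of the rule)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


if __name__ == "__main__":
    entries = 3_000_000  # "several million entities", picked for illustration
    for label, size in [
        ("two-thirds rule", slots_for_two_thirds_load(entries)),
        ("power of two", next_power_of_two(entries)),
    ]:
        print(f"{label}: {size} slots, load factor {entries / size:.2f}")
```

One practical upside of the power-of-two size is that the slot index can be computed as `hash & (size - 1)` rather than with a modulo, though that relies on the low bits of the hash being well mixed.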
Hope this helps!