.NET: 64-bit hash code
I need a 64-bit hash for strings, and the default .GetHashCode() returns only a 32-bit int. I could generate an MD5/SHA1 hash and use only the first 64 bits, but because those algorithms are designed to be cryptographically secure, they are much more demanding on the CPU.
Could it be as simple as calling .GetHashCode() a second time, on the reverse of the input string, and combining the two 32-bit ints into a 64-bit long? Would that have the same spread and collision resistance as a 'real' 64-bit hash like CRC64?
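For reference, the "first 64 bits of MD5" idea can be sketched roughly as follows. This is a minimal illustration in Python rather than .NET, chosen only for brevity; the function name `md5_64`, the UTF-8 encoding, and the big-endian byte order are assumptions, not something the question specifies.

```python
import hashlib


def md5_64(text: str) -> int:
    """Return the first 64 bits of the MD5 digest of `text` as an unsigned int.

    Assumptions (not from the question): UTF-8 encoding, big-endian byte order.
    """
    digest = hashlib.md5(text.encode("utf-8")).digest()  # 16 bytes = 128 bits
    return int.from_bytes(digest[:8], byteorder="big")   # keep 8 bytes = 64 bits


if __name__ == "__main__":
    print(f"{md5_64('hello world'):016x}")
```

Unlike GetHashCode(), the result depends only on the input bytes, so it is the same across runs and machines. The truncated value is still just a 64-bit hash, so it is no more collision resistant than any other well-distributed 64-bit value.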
Comments (3)
You are about to make a very big mistake. A 64-bit hash isn't nearly good enough to guarantee uniqueness. That requires at least 128 bits. A GUID is a common choice.
Generating unique 32-bit or 64-bit numbers isn't that hard: you simply use the next one. The rub is that you need to know the previous one. Database engines never have a problem with that; remembering stuff is their reason for being.
Use an auto-increment column.
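To put rough numbers on the 64-bit versus 128-bit point, the standard birthday approximation gives the chance of at least one collision among n uniformly random b-bit values as about 1 - exp(-n(n-1)/2^(b+1)). The sketch below is in Python for brevity; the function name `collision_probability` is illustrative, and it assumes an ideal, uniformly distributed hash.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate P(at least one collision) among n random `bits`-bit values.

    Birthday approximation: 1 - exp(-n*(n-1) / 2**(bits+1)).
    Assumes an ideal hash whose outputs are uniformly distributed.
    """
    exponent = n * (n - 1) / 2 ** (bits + 1)
    return -math.expm1(-exponent)  # equals 1 - exp(-x), accurate for tiny x


if __name__ == "__main__":
    for n in (10**6, 10**9, 2**32):
        print(n, collision_probability(n, 64), collision_probability(n, 128))
```

Under that assumption, about 2^32 items (roughly 4.3 billion) give close to a 39% chance of a collision at 64 bits, while at 128 bits the chance for the same number of items is negligible. Neither width guarantees uniqueness in the strict sense; 128 bits just makes a collision improbable enough to ignore in practice.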
Just to get this out of the way: you know that GetHashCode() doesn't generate anything unique, right? Two completely different strings can return the same hash code. The algorithm is only intended to distribute objects evenly in a hash table. From the horse's mouth:
Additionally, the rules for what happens when you call GetHashCode() can and will change over time. See the section titled "Rule: Consumers of GetHashCode cannot rely upon it being stable over time or across appdomains" here, specifically:
To see someone's collision detection work, check this out.
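Since GetHashCode() is not guaranteed to be stable, a hash that has to be stored or compared across runs needs an algorithm whose definition is fixed. As one illustration (nobody in this thread proposes it), here is a minimal sketch of 64-bit FNV-1a, a simple non-cryptographic hash, written in Python for brevity. The function name `fnv1a_64` and the UTF-8 encoding are assumptions.

```python
# Published FNV-1a parameters for the 64-bit variant.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF  # keeps the running value within 64 bits


def fnv1a_64(text: str) -> int:
    """Return the 64-bit FNV-1a hash of `text` (UTF-8 encoding assumed)."""
    h = FNV64_OFFSET_BASIS
    for byte in text.encode("utf-8"):
        h ^= byte                       # FNV-1a: XOR the byte in first...
        h = (h * FNV64_PRIME) & MASK64  # ...then multiply, wrapping at 64 bits
    return h


if __name__ == "__main__":
    print(f"{fnv1a_64('hello world'):016x}")
```

The same loop ports directly to C# using `ulong` arithmetic in an `unchecked` context. It is still only 64 bits wide, so collisions remain possible and have to be handled.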
Is there a particular reason you chose 64 bits? MD5 is more for checking that the content hasn't changed by accident, and SHA is more for making sure the content wasn't changed on purpose. I'd definitely use at LEAST SHA1.