sha-1 是否会对小于 160 位的输入消息产生冲突?
我有一个 128 位 ID,我想对其执行单向哈希,但我不想为输入消息获得相同的摘要。有谁知道 sha-1 或替代方案是否保证不会对小于其输出摘要大小的消息集产生冲突?这至少在理论上是可能的......
我也考虑过使用RSA,并丢弃私钥来给我进行单向加密,但我需要将结果存储在32个字符的数据库字段中,以及我可用的加密方案不要生产足够小的东西。
欢迎任何关于产生原始值的确定性、不可逆和无碰撞变换的另一种方式的建议。
I have a 128bit ID that I want to perform a one way hash on, but I don't want to ever get the same digest for an input message. Does anyone know if sha-1, or an alternative, is guaranteed not to produce collisions for the set of messages less than its output digest size? This is at least theoretically possible...
I also considered using RSA, and discarding the private key to give me a one-way encrypt, but I need to store the result in a 32 char DB field, and the encryption schemes available to me don't produce anything small enough.
Any suggestions of another way of producing a deterministic, non-reversable and collision free transform of the original value are welcome.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
加密哈希给出了给定输入的随机数的非常好的近似值。那么,在获得相同的 160 位之前,一个房间需要多少个随机哈希值?关于平方根(免责声明:我不是统计学家)。因此,您应该会在 80 位左右看到冲突。
我想实用性意味着你应该知道宇宙射线何时会成为比碰撞更大的问题。
Cryptographic hashes give a very good approximation of a random number for a given input. So how many random hashes do you need in a room until you get the same 160 bits? It about the square root (disclaimer: I am not a statistician). So you should expect to see clashes at around 80-bits.
I guess practicalities mean you should know when cosmic rays will be a bigger problem than collisions.
如果要计算从 128 位到 128 位的秘密排列,一种简单的解决方案是使用 128 位分组密码(例如 AES)以及固定但秘密的密钥。当然,您必须能够永远保守密钥的秘密,否则您将一无所有。
If you want to compute a secret permutation from 128 bits to 128 bits, one simple solution is to use a 128-bit block cipher like AES with a fixed but secret key. You must, of course, be able to keep the key secret forever or you've got nothing.
您的 ID 是唯一的,并且是 128 位。
您的评论表明您无法按原样使用该 ID。
您需要它是唯一的,而不仅仅是可能唯一。因此,您不能使用哈希。
您不能同时拥有两个世界 - 您不能拥有不可逆的 1:1 映射。这是不可能的。
加密 - 一种双射操作,因此不会发生冲突 - 带有密钥的 ID 将使反转 ID 以确定其原始值非常困难。
AES 具有 128 位的良好块长度,它将根据 128 位输入生成 128 位输出,比旧算法更快 (!),并且广泛适用于大多数平台和语言。我建议您使用 AES 来达到您的目的。
Your ID is unique and 128 bits.
Your comments explain that you cannot use the ID as-is.
You need it to be unique, not just probably unique. Therefore, you cannot use a hash.
You cannot have both worlds - you cannot have a 1:1 mapping that is not reversible. Its an impossibility.
Encrypting - a bijective operation, so there'll be no collisions - IDs with a secret key will make reversing the ID to determine its original value very hard.
AES has a nice block length of 128 bits which will generate 128 bits output from your 128 bits of input, is faster than old algorithms (!) and is widely available for most platforms and languages. I suggest you use AES for your purpose.
哈希函数被设计为不产生冲突,但没有什么是“保证的”。相反,可以保证会发生冲突,因为消息空间实际上是无限的,而可能的哈希值数量有限。
然而,SHA-1 已被证明抗碰撞,这是您所能期望的最好结果。
Hash functions were designed not to produce collisions, but nothing is "guaranteed." On the contrary, it is guaranteed that there WILL be collisions, because the message space is practically indefinite, while you have a limited number of possible hashes.
SHA-1 however has been proven to be collision-resistant, and that's the best you can hope for.
我不知道什么哈希函数可以避免冲突,但是,如果您在这里找不到答案,一个好的起点可能是 完美哈希函数。从该页面:
该页面上有许多指向更多信息的链接,您可能会觉得有用。
话虽这么说,你能说出为什么你需要一个完美的 has 函数吗?可能有其他方法可以完成您想要的事情而不需要该属性,这里有人可能可以提出建议。
I don't know what hash functions avoid collisions but, if you can't find the answer here, a good starting point might be Perfect Hash Function on Wikipedia. From that page:
There's a number of links to more information on that page that you may find useful.
That being said, can you say why you need a perfect has function? It may be that there are other ways to accomplish what you want to without needing that property, and someone here may be able to make a suggestion.
对于足够大的位大小,我认为离散求幂是一个 1:1 函数,但反转在计算上是不可行的。我不确定需要多大“大”。极其缓慢(但概念上可以理解)实现的代码:
该程序将花费数十亿步来计算许多 dat 值的哈希值,但使用更复杂的代码,可以在合理的时间内完成这项工作。当然,32 位哈希在加密方面毫无价值,但该方法可以轻松扩展到任何大小。
For sufficiently large bit sizes, I think discrete exponentiation is a 1:1 function but reversal is computationally infeasible. I'm not sure how "large" is required though. Code for an unusably slow (but conceptually understandable) implementation:
This program would take billions of steps to compute the hash for many values of dat, but with trickier code the job can be done in reasonable time. A 32-bit hash is cryptographically worthless, of course, but the approach can be readily extended to any size.
散列“不太可能”产生任何重复项,但不能保证。上
另一方面,任何对称加密方案都会为 128 位输入生成 128 位输出,
并保证不重复。
另一方面,如果您依赖于哈希值是唯一的,我的直觉是您
做错事。例如,如果您使用哈希来混淆密码,那么您
必须小心,不要将散列密码设置为事实上的密码。
Hashing is "unlikely" to produce any duplicates, but there are no guarantees. On the
other hand, any symmetric encryption scheme will produce 128 bits out for 128 bits in,
and guarantee no duplicates.
On the other hand, if you are depending on the hashes being unique, my intuition is you're
doing something wrong. If you're using hashes to obfuscate passwords for example, you
have to be careful that you don't make the hashed password the de facto password.
单向哈希的要点难道不是您无法(通常)从哈希值恢复原始数据吗?那么为什么设计哈希函数的人会不遗余力地防止小输入的冲突呢?
相反,您似乎想要模糊数据,这对于某些目的来说是很好的。如果使用公钥加密不切实际,您不妨编写自己的函数。
Isn't the point of a one-way hash that you can't (in general) recover the original data from the hashed value? So why would someone designing a hash function go out of their way to prevent collisions for small inputs?
Instead, it looks like you want to obscure the data, which is fine for some purposes. If it's not practical to use public key encryption, you might as well write your own function.
根据这篇文章 http://www.debian-administration.org/users/ dkg/weblog/48,
然而,据我所知还没有发现任何碰撞,这意味着,即使人们非常努力也没有发现任何碰撞。因此,假设没有发生冲突似乎是合理的(如果您不处理高安全性数据)
According to this article http://www.debian-administration.org/users/dkg/weblog/48,
However, as far as I know no collision has been found yet , which means , even people which have tried really hard haven't found any collisions. So it seems reasonable to assume that no collision happen (if you are not dealing with high-security data)