您能否向我展示两个实际的、重要的字符串,它们产生相同的 MD5 或 SHA1 哈希值?
...如果没有,为什么不呢?
所以这就是问题背后的问题。
据我所知,MD5 和 SHA1 中发生意外冲突的可能性很小(尽管 SHA1 中的可能性比 MD5 中的可能性小)。我也明白故意碰撞在理论上是可能的。这实际上可能吗?我可以通过一些过程在这些算法中故意生成具有相同哈希值的两条消息吗?我要经历什么流程?
...and if not, why not?
So here's the question behind the question.
I understand that the likelihood of accidental collisions in MD5 and SHA1 is small (though less likely in SHA1 than in MD5). I also understand that deliberate collisions are theoretically possible. Is it practically possible? Could I go through some process to deliberately generate two messages with the same hash, in either of these algorithms? What process would I go through?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
从数学意义上来说,给定的哈希函数必然存在冲突:可能的输入多于可能的输出,因此必须有两个输入映射到相同的输出。现在证明碰撞的存在和实际找到碰撞是两件不同的事情。如果我把一颗钻石扔到海洋中央,我肯定知道海洋中的某个地方现在有一颗钻石 - 但如果我想恢复它,我会很茫然。
对于输出为 n 位的“通用”哈希函数,有通用方法来查找冲突,平均成本为 2n/2 函数的评估(参见此页面)。根据n,这可能很容易,也可能完全不可行。 MD5 的输出为 128 位,264 是“相当高”:你可以做到,但需要数千台机器和数月的计算。
现在MD5有一些已知的弱点,即一些内部结构可以被利用来更容易地产生冲突。迄今为止已知的对 MD5 的最佳攻击需要略小于221 函数调用,因此在基本 PC 上这只需几秒钟(最多)。 @Omri 在他的回应中指出了 MD5 冲突的一个很好的例子,其中冲突的消息实际上是具有广泛不同行为的可执行文件。
对于 SHA-1,输出大小为 160 位。这意味着一般碰撞攻击的成本约为 280,这是现有技术无法实现的(嗯,人类可以做到这一点,但是当然不是谨慎地:这应该可以用相当于整个美国陆军一年的预算来实现)。然而,SHA-1 与 MD5 一样,也存在已知的弱点。目前,这些弱点仍然是理论上的,因为它们会导致成本 261 的碰撞攻击,这对于任何单个加密研究实验室来说都太昂贵了,因此尚未完全实施(有一次已宣布的攻击,成本为 251,但似乎是一次失败 - 分析存在缺陷)。因此,没有实际的碰撞可以显示(但研究人员非常确定 261 攻击是正确的,并且如果有人找到预算的话,它会起作用)。
对于 SHA-256,不存在已知的弱点,并且 256 位输出大小意味着一般成本为 2128,这对于当今和未来的技术而言是无法挽回的。
Collisions necessarily exist for a given hash function, in a mathematical sense: there are more possible inputs than possible outputs, so there must be two inputs which map to the same output. Now proving the existence of a collision, and actually finding one, are two different things. If I drop a diamond in the middle of the ocean, I positively know that there is now a diamond somewhere in the ocean -- but I am quite at a loss if I want to recover it.
For a "generic" hash function with an output of n bits, there are generic methods to find a collision, with average cost 2n/2 evaluations of the function (see this page). Depending on n, this can range from the easy to the totally unfeasible. MD5 has an output of 128 bits, and 264 is "quite high": you can do it, but it will require a few thousands of machines and months of computations.
Now there are known weaknesses in MD5, i.e. some internal structure which can be exploited to produce collisions much more easily. Best attack on MD5 known so far requires a bit less than 221 function invocations, so this is a matter of a few seconds (at most) on a basic PC. @Omri points in his response to a great example of an MD5 collision, in which the colliding messages are actually executable files with widely different behaviors.
For SHA-1, the output has size 160 bits. This means that a generic collision attack has cost about 280, which is not attainable with existing technology (well, Mankind could do it, but certainly not discreetly: it should be doable with, say, the equivalent of one year of budget for the whole US Army). However, SHA-1, like MD5, has known weaknesses. Right now, these weaknesses are still theoretical, in that they lead to a collision attack with cost 261, which is too expensive for any single crypto research lab, and thus has not been fully conducted yet (there was an announced attack with cost 251 but it seems that it was a dud -- the analysis was flawed). So no actual collision to show (but researchers are pretty sure that the 261 attack is correct and would work, if someone found the budget).
With SHA-256, there is no known weakness, and the 256-bit output size implies a generic cost of 2128, far away into the undoable with today's and tomorrow's technology.