什么时候使用损坏的哈希函数是安全的?

发布于 2024-09-02 18:00:20 字数 1476 浏览 4 评论 0原文

使用像 SHA-256 这样的安全哈希函数是微不足道的,继续使用 MD5 来保证安全是鲁莽的行为。然而,我想更好地理解哈希函数漏洞的一些复杂性。

冲突已针对 MD4 和 MD5 生成。根据 NIST 的说法,MD5 不是安全哈希函数。只需239 操作即可生成碰撞绝对不应该用于密码。然而,SHA-1 容易受到类似的碰撞攻击,其中可以发现碰撞需要 269 次操作,而暴力破解则需要 280。没有人产生 SHA-1 冲突,NIST 仍将 SHA-1 列为安全消息摘要功能

那么什么时候使用损坏的哈希函数是安全的呢?即使某个功能被破坏,它仍然可以“足够大”。 根据 Schneier,容易受到冲突攻击的哈希函数可以仍可用作 HMAC。我相信这是因为 HMAC 的安全性依赖于其秘密密钥,并且在获得该密钥之前无法发现冲突。一旦你在 HMAC 中使用了密钥,它就已经被破坏了,所以这是一个没有实际意义的问题。哪些哈希函数漏洞会破坏 HMAC 的安全性?

让我们进一步了解这个属性。如果在密码前面添加盐,那么使用 MD4 等非常弱的消息摘要作为密码是否会变得安全?请记住,MD4 和 MD5 攻击是前缀攻击,如果在前面添加盐,则攻击者无法控制消息的前缀。如果盐确实是一个秘密,并且攻击者不知道,那么将其附加到密码中是否重要?可以安全地假设攻击者在获得整个消息之前无法生成冲突吗?

您是否知道可以在安全上下文中使用损坏的哈希函数而不引入漏洞的其他情况?

(请张贴支持证据,因为它太棒了!)

It is trivial to use a secure hash function like SHA-256, and continuing to use MD5 for security is reckless behavior. However, there are some complexities to hash function vulnerabilities that I would like to better understand.

Collisions have been generated for MD4 and MD5. According to NIST, MD5 is not a secure hash function. It only takes 239 operations to generate a collision and should never be used for passwords. However SHA-1 is vulnerable to a similar collision attack in which a collision can be found in 269 operations, whereas brute force is 280. No one has generated a SHA-1 collision and NIST still lists SHA-1 as a secure message digest function.

So when is it safe to use a broken hash function? Even though a function is broken it can still be "big enough". According to Schneier a hash function vulnerable to a collision attack can still be used as an HMAC. I believe this is because the security of an HMAC is dependent on its secret key and a collision cannot be found until this key is obtained. Once you have the key used in an HMAC it's already broken, so it's a moot point. What hash function vulnerabilities would undermine the security of an HMAC?

Let's take this property a bit further. Does it then become safe to use a very weak message digest like MD4 for passwords if a salt is prepended to the password? Keep in mind the MD4 and MD5 attacks are prefixing attacks, and if a salt is prepended then an attacker cannot control the prefix of the message. If the salt is truly a secret, and isn't known to the attacker, then does it matter if it's appended to the password? Is it safe to assume that an attacker cannot generate a collision until the entire message has been obtained?

Do you know of other cases where a broken hash function can be used in a security context without introducing a vulnerability?

(Please post supporting evidence because it is awesome!)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

一笑百媚生 2024-09-09 18:00:20

实际上,冲突比您在 MD5 和 SHA-1 上列出的更容易。 MD5 冲突的发现时间相当于 226.5 操作(其中一个“操作”是通过短消息计算 MD5)。请参阅此页面了解一些详细信息和攻击的实现(我编写了该代码;它在 64 位模式下的 2.4 GHz Core2 x86 上平均 14 秒内发现了冲突)。

同样,最著名的 SHA-1 攻击发生在大约 261 操作中,而不是 269 操作中。这仍然是理论上的(尚未产生实际碰撞),但它在可行的范围内。

至于对安全性的影响:哈希函数通常被认为具有三个属性:

  • 无原像:给定 y,找到 x 使得 h 不应该是可行的(x) = y
  • 没有第二个原像:给定 x1,找到 x2 应该是不可行的(与 x1 不同) >x1) 使得 h(x1) = h(x2)
  • 无碰撞:应该不可能找到任何 x1x2(彼此不同)这样 h(x1) = h(x2)

对于具有 n 位输出的哈希函数,存在 2n 中的通用攻击(无论哈希函数的详细信息如何,都会起作用) > 操作用于前两个属性,2n/2 操作用于第三个属性。如果对于给定的散列函数,发现了一种攻击,该攻击通过利用散列函数如何操作的特殊细节,比相应的通用攻击更快地找到原像、第二原像或碰撞,则该散列函数被称为被“打破”。

然而,并非哈希函数的所有用法都依赖于所有三个属性。例如,数字签名首先对要签名的数据进行哈希处理,然后在算法的其余部分中使用哈希值。这依赖于对原像和第二原像的抵抗力,但数字签名本身并不受冲突的影响。在某些特定的签名场景中,冲突可能是一个问题,其中攻击者可以选择要由受害者签名的数据(基本上,攻击者计算冲突,让受害者签署一条消息,并且签名在以下时间内有效)另一条消息也是如此)。这可以通过在计算签名之前在签名消息前添加一些随机字节来抵消(攻击和解决方案在 X.509 证书的上下文中进行了演示)。

HMAC 安全性依赖于哈希函数必须满足的其他属性;也就是说,“压缩函数”(构建哈希函数的基本块)充当伪随机函数 (PRF)。关于 PRF 的详细信息非常技术性,但粗略地说,PRF 应该与随机预言机没有什么区别。随机预言被建模为一个黑匣子,其中包含一个侏儒、一些骰子和一本大书。在某些输入数据上,侏儒选择一个随机输出(用骰子)并在书中记下输入消息和随机选择的输出。侏儒使用这本书来检查他是否已经看到相同的输入消息:如果是,则侏儒返回与之前相同的输出。通过构造,在尝试之前,您对给定消息的随机预言的输出一无所知。

随机预言模型允许在 PRF 调用中量化 HMAC 安全证明。基本上,证明表明,如果不多次调用 PRF,HMAC 就无法被破解,而“巨大”是指在计算上不可行。

不幸的是,我们没有随机预言,所以在实践中我们必须使用哈希函数。没有证据证明散列函数确实存在,具有 PRF 属性;现在,我们只有候选函数,即我们(尚)无法证明其压缩函数不是 PRF 的函数。

如果压缩函数是PRF那么散列函数会自动抵抗冲突。这就是 PRF 魔力的一部分。 因此,如果我们可以找到哈希函数的冲突,那么我们就知道内部压缩函数不是 PRF。这不会将冲突转变为对 HMAC 的攻击。能够随意产生冲突无助于破坏 HMAC。然而,这些冲突表明与 HMAC 相关的安全证明并不适用。保证无效。这和笔记本电脑一样:打开机箱并不一定会损坏机器,但之后你就得靠自己了。

Kim-Biryukov-Preneel-Hong 文章中,介绍了一些针对 HMAC 的攻击,特别是针对 HMAC-MD4 的伪造攻击。该攻击利用了 MD4 的缺点(它的“弱点”),使其成为非 PRF。相同弱点的变体被用来在 MD4 上生成冲突(MD4 已被彻底破坏;某些攻击生成冲突的速度比哈希函数本身的计算速度还要快!)。因此,这些冲突并不意味着 HMAC 攻击,但两种攻击都来自同一源。但请注意,伪造攻击的成本258,相当高(没有产生实际的伪造,结果仍然是理论上的),但远低于抵抗力HMAC 的预期级别(具有具有 n 位输出的强大哈希函数,HMAC 应抵抗高达 2n 工作因子;n = 128)。

因此,虽然冲突本身并不意味着 HMAC 存在缺陷,但它们是个坏消息。在实践中,碰撞对于极少数设置来说是一个问题。但是,了解冲突是否会影响哈希函数的给定使用已经够棘手的了,继续使用已经证明存在冲突的哈希函数是非常不明智的。

对于 SHA-1,攻击仍处于理论阶段,SHA-1 已得到广泛部署。这种情况是这样描述的:“警报已响,但没有明显的火灾或烟雾。是时候朝出口走去——但不要逃跑。”

有关该主题的更多信息,请首先阅读应用密码学手册,Menezes、van Oorschot 和 Vanstone 所著,这是密码学家学徒的必读之作(不要与 B. Schneier 的《应用密码学》混淆,这是一篇写得很好的介绍,但没有《手册》那么详尽) 。

Actually collisions are easier than what you list on both MD5 and SHA-1. MD5 collisions can be found in time equivalent to 226.5 operation (where one "operation" is the computation of MD5 over a short message). See this page for some details and an implementation of the attack (I wrote that code; it finds a collision within an average of 14 seconds on a 2.4 GHz Core2 x86 in 64-bit mode).

Similarly, the best known attack on SHA-1 is in about 261 operations, not 269. It is still theoretical (no actual collision was produced yet) but it is within the realm of the feasible.

As for implications on security: hash functions are usually said to have three properties:

  • No preimage: given y, it should not be feasible to find x such that h(x) = y.
  • No second preimage: given x1, it should not be feasible to find x2 (distinct from x1) such that h(x1) = h(x2).
  • No collision: it should not be feasible to find any x1 and x2 (distinct from each other) such that h(x1) = h(x2).

For a hash function with a n-bit output, there are generic attacks (which work regardless of the details of the hash function) in 2n operations for the two first properties, and 2n/2 operations for the third. If, for a given hash function, an attack is found, which, by exploiting special details of how the hash function operates, finds a preimage, a second preimage or a collision faster than the corresponding generic attack, then the hash function is said to be "broken".

However, not all usages of hash functions rely on all three properties. For instance, digital signatures begin by hashing the data which is to be signed, and then the hash value is used in the rest of the algorithm. This relies on the resistance to preimages and second preimages, but digital signatures are not, per se, impacted by collisions. Collisions may be a problem in some specific signature scenarios, where the attacker gets to choose the data that is to be signed by the victim (basically, the attacker computes a collision, has one message signed by the victim, and the signature becomes valid for the other message as well). This can be counteracted by prepending some random bytes to the signed message before computing the signature (the attack and the solution where demonstrated in the context of X.509 certificates).

HMAC security relies on an other property that the hash function must fulfill; namely, that the "compression function" (the elementary brick on which the hash function is built) acts as a Pseudo-Random Function (PRF). Details on what a PRF is are quite technical, but, roughly speaking, a PRF should be indistinguishable from a Random Oracle. A random oracle is modeled as a black box which contains a gnome, some dice and a big book. On some input data, the gnome select a random output (with the dice) and writes down in the book the input message and the output which was randomly selected. The gnome uses the book to check whether he already saw the same input message: if so, then the gnome returns the same output than previously. By construction, you can know nothing about the output of a random oracle on a given message until you try it.

The random oracle model allows the HMAC security proof to be quantified in invocations of the PRF. Basically, the proof states that HMAC cannot be broken without invoking the PRF a huge number of times, and by "huge" I mean computationally infeasible.

Unfortunately, we do not have random oracles, so in practice we must use hash functions. There is no proof that hash functions really exist, with the PRF property; right now, we only have candidates, i.e. functions for which we cannot prove (yet) that their compression functions are not PRF.

If the compression function is a PRF then the hash function is automatically resistant to collisions. That's part of the magic of PRF. Therefore, if we can find collisions for a hash function, then we know that the internal compression function is not a PRF. This does not turn the collisions into an attack on HMAC. Being able to generate collisions at will does not help in breaking HMAC. However, those collisions demonstrate that the security proof associated with HMAC does not apply. The guarantee is void. That's just the same than a laptop computer: opening the case does not necessarily break the machine, but afterwards you are on your own.

In the Kim-Biryukov-Preneel-Hong article, some attacks on HMAC are presented, in particular a forgery attack on HMAC-MD4. The attack exploits the shortcomings of MD4 (its "weaknesses") which make it a non-PRF. Variants of the same weaknesses were used to generate collisions on MD4 (MD4 is thoroughly broken; some attacks generate collisions faster than the computation of the hash function itself !). So the collisions do not imply the HMAC attack, but both attacks feed on the same source. Note, though, that the forgery attack has cost 258, which is quite high (no actual forgery was produced, the result is still theoretical) but substantially lower than the resistance level expected from HMAC (with a robust hash function with an n-bit output, HMAC should resist up to 2n work factor; n = 128 for MD4).

So, while collisions do not per se imply weaknesses on HMAC, they are bad news. In practice, collisions are a problem for very few setups. But knowing whether collisions impact a given usage of hash functions is tricky enough, that it is quite unwise to keep on using a hash function for which collisions were demonstrated.

For SHA-1, the attack is still theoretical, and SHA-1 is widely deployed. The situation has been described like this: "The alarm is on, but there is no visible fire or smoke. It is time to walk towards the exits -- but not to run."

For more information on the subject, begin by reading the chapter 9 of the Handbook of Applied Cryptography, by Menezes, van Oorschot and Vanstone, a must-read for the apprentice cryptographer (not to be confused with "Applied Cryptography" by B. Schneier, which is a well-written introduction but nowhere as thorough as the "Handbook").

薄荷→糖丶微凉 2024-09-09 18:00:20

使用损坏的哈希函数的唯一时间是当冲突的后果无害或微不足道时,例如将文件分配到文件系统上的存储桶时。

The only time it is safe to use a broken hash function is when the consequences of a collision are harmless or trivial, e.g. when assigning files to a bucket on a filesystem.

沙与沫 2024-09-09 18:00:20

当你不在乎它是否安全时。

说真的,在几乎每种语言中使用安全哈希函数都不需要任何额外的努力,而且对性能的影响可以忽略不计,所以我不明白为什么你不这样做。

[实际阅读您的问题后进行编辑]

根据 Schneier 的说法,容易受到冲突攻击的哈希函数仍然可以用作 HMAC。我相信这是因为 HMAC 的安全性取决于其秘​​密密钥,并且在获得该密钥之前无法发现冲突。

实际上,本质上是因为能够生成哈希冲突并不一定可以帮助您生成 hash-of-a-hash(与 HMAC 使用的异或运算相结合)。

如果将盐附加到密码上,那么使用 md4 等非常弱的消息摘要作为密码是否会变得安全?

不,如果哈希值具有原像攻击,允许您将数据添加到输入中,则不会。例如,如果哈希值是 H(pass + salt),我们需要进行原像攻击来找到 pass2,使得 H(pass2 + salt) = H(pass +盐)。

过去曾发生过附加攻击,所以我确信前置攻击是可能的。

When you don't care whether it's safe or not.

Seriously, it doesn't take any extra effort to use a secure hash function in pretty much every language, and performance impact is negligible, so I don't see why you wouldn't.

[Edit after actually reading your question]

According to Schneier a hash function vulnerable to a collsion attack can still be used as an HMAC. I believe this is because the security of an HMAC is Dependant on its secret key and a collision cannot be found until this key is obtained.

Actually, it's essentially because being able to generate a collision for a hash does not necessarily help you generate a collision for the hash-of-a-hash (combined with the XORing used by HMACs).

Does it then become safe to use a very weak message digest like md4 for passwords if a salt is perpended to the password?

No, not if the hash has a preimage attack which allows you to prepend data to the input. For instance, if the hash was H(pass + salt), we'd need a preimage attack which allows us to find pass2 such that H(pass2 + salt) = H(pass + salt).

There have been append attacks in the past, so I'm sure prepend attacks are possible.

可爱暴击 2024-09-09 18:00:20

下载站点使用 MD5 哈希值作为校验和来确定文件在下载过程中是否已损坏,我认为损坏的哈希值足以满足此目的。

假设 MITM 决定修改文件(例如 zip 存档或 exe)。现在,攻击者必须做两件事 -

  1. 找到哈希冲突并从中创建修改后的文件
  2. 确保新创建的文件也是有效的 exe 或 zip 存档

具有损坏的哈希,1比较容易一点。但确保碰撞同时满足文件的其他已知属性的计算成本太高。

这完全是我自己的答案,我可能是非常错误的。

Download sites use MD5 hash as a checksum to determine if the file was corrupted during download, and I would say a broken hash is good enough for that purpose.

Lets say that a MITM decides to modify the file (say a zip archive, or an exe). Now, the attacker has to do two things -

  1. Find a hash collision and create a modified file out of it
  2. Ensure that the newly created file is also a valid exe or a zip archive

With a broken hash, 1 is a bit easier. But ensuring that the collision simultaneously meets other known properties of the file is too expensive computationally.

This is totally my own answer, and I could be terribly wrong.

相思故 2024-09-09 18:00:20

答案完全取决于您使用它的用途。如果您需要在几毫秒内防止某人产生碰撞,那么我会比您需要在几十年内防止某人产生碰撞更担心。

您实际上想解决什么问题?

The answer entirely depends on what you're using it for. If you need to prevent somebody producing a collision with a few milliseconds I'd be less worried than if you need to prevent somebody producing a collision within a few decades.

What problem are you actually trying to solve?

煮茶煮酒煮时光 2024-09-09 18:00:20

大多数对使用 MD4 之类的密码作为密码的担忧与当前已知的攻击无关,而是与以下事实相关:一旦分析到冲突生成很容易,通常认为某人更有可能将能够利用这些知识来创建原像攻击——当/如果发生这种情况时,基本上该哈希函数的所有可能用途都会变得容易受到攻击。

Most of the worry about using something like MD4 for a password is related less to currently known attacks, than to the fact that once it has been analyzed to the point that collision generation is easy, it is generally presumed to be considerably more likely that somebody will be able to use that knowledge to create a preimage attack -- and when/if that happens, essentially all possible uses of that hash function become vulnerable.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文