Which cryptographic hash function should I choose?
The .NET framework ships with 6 different hashing algorithms:
- MD5: 16 bytes (Time to hash 500MB: 1462 ms)
- SHA-1: 20 bytes (1644 ms)
- SHA256: 32 bytes (5618 ms)
- SHA384: 48 bytes (3839 ms)
- SHA512: 64 bytes (3820 ms)
- RIPEMD: 20 bytes (7066 ms)
Each of these functions performs differently; MD5 being the fastest and RIPEMD being the slowest.
MD5 has the advantage that it fits in the built-in Guid type, and it is the basis of the type 3 UUID. The SHA-1 hash is the basis of the type 5 UUID. This makes both of them really easy to use for identification.
However, MD5 is vulnerable to collision attacks. SHA-1 is also vulnerable, but to a lesser degree.
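To illustrate the point about the Guid type, here is a minimal sketch that packs a 16-byte MD5 digest into a Guid. The `HashId` class and `FromString` method are hypothetical names introduced for illustration. Note that this is not a standards-compliant type 3 UUID, which also mixes in a namespace and sets version and variant bits; it only shows that the digest fits.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the original question.
static class HashId
{
    // Packs the 16-byte MD5 digest of a string into a Guid.
    // Assumption: UTF-8 is an acceptable encoding for the input.
    public static Guid FromString(string value)
    {
        using (var md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(value));
            return new Guid(digest); // Guid(byte[]) requires exactly 16 bytes
        }
    }
}
```

The same input always yields the same Guid, so it can serve as a lookup key as long as nobody is deliberately crafting colliding inputs.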
Under what conditions should I use which hashing algorithm?
Particular questions I'm really curious to see answered are:
- Is MD5 not to be trusted? Under normal circumstances, when you use the MD5 algorithm with no malicious intent and no third party has any malicious intent, would you expect ANY collisions (meaning two arbitrary byte[] producing the same hash)?
- How much better is RIPEMD than SHA1, if it is any better? It is more than 4 times slower to compute in my benchmark, but the hash size is the same as SHA1.
- What are the odds of getting non-malicious collisions when hashing file names (or other short strings) with MD5 / SHA1 / SHA2xx, e.g. 2 random file names with the same MD5 hash? In general, what are the odds of non-malicious collisions?
This is the benchmark I used:
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Security.Cryptography;

    static class HashBenchmark {
        // Runs func the given number of times and prints the total elapsed time.
        static void TimeAction(string description, int iterations, Action func) {
            var watch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) {
                func();
            }
            watch.Stop();
            Console.Write(description);
            Console.WriteLine(" Time Elapsed {0} ms", watch.ElapsedMilliseconds);
        }

        // Returns a buffer of pseudo-random bytes to use as hash input.
        static byte[] GetRandomBytes(int count) {
            var bytes = new byte[count];
            (new Random()).NextBytes(bytes);
            return bytes;
        }

        static void Main(string[] args) {
            // RIPEMD160Managed is only available on the .NET Framework.
            var algorithms = new Dictionary<string, HashAlgorithm> {
                { "md5", new MD5CryptoServiceProvider() },
                { "sha1", new SHA1CryptoServiceProvider() },
                { "sha256", new SHA256CryptoServiceProvider() },
                { "sha384", new SHA384CryptoServiceProvider() },
                { "sha512", new SHA512CryptoServiceProvider() },
                { "ripemd160", new RIPEMD160Managed() }
            };

            // 1000 KB of input, hashed 500 times below: roughly 500 MB per algorithm.
            var source = GetRandomBytes(1000 * 1024);

            foreach (var pair in algorithms) {
                Console.WriteLine("Hash Length for {0} is {1}",
                    pair.Key,
                    pair.Value.ComputeHash(source).Length);
            }

            foreach (var pair in algorithms) {
                TimeAction(pair.Key + " calculation", 500, () => {
                    pair.Value.ComputeHash(source);
                });
            }

            Console.ReadKey();
        }
    }
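In cryptography, hash functions provide three separate properties:
- Collision resistance: how hard it is to find any two messages that hash to the same value.
- Preimage resistance: given a hash, how hard it is to find a message that produces it.
- Second preimage resistance: given a message, how hard it is to find a different message with the same hash.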
These properties are related but independent. For example, collision resistance implies second preimage resistance, but not the other way around. For any given application, you will have different requirements, needing one or more of these properties. A hash function for securing passwords on a server will usually only require preimage resistance, while message digests require all three.
It has been shown that MD5 is not collision resistant. However, that does not preclude its use in applications that do not require collision resistance. Indeed, MD5 is often still used in applications where the smaller digest size and speed are beneficial. That said, due to its flaws, researchers recommend the use of other hash functions in new scenarios.
SHA1 has a flaw that allows collisions to be found in theoretically far less than the 2^80 steps a secure hash function of its length would require. The attack is continually being revised and currently can be done in ~2^63 steps - just barely within the current realm of computability (as of April, 2009). For this reason NIST is phasing out the use of SHA1, stating that the SHA2 family should be used after 2010.
SHA2 is a new family of hash functions created following SHA1. Currently there are no known attacks against SHA2 functions. SHA256, 384 and 512 are all part of the SHA2 family, just with different digest lengths.
RIPEMD I can't comment too much on, except to note that it isn't as commonly used as the SHA families, and so has not been scrutinized as closely by cryptographic researchers. For that reason alone I would recommend the use of SHA functions over it. In the implementation you are using it seems quite slow as well, which makes it less useful.
In conclusion, there is no one best function - it all depends on what you need it for. Be mindful of the flaws with each and you will be best able to choose the right hash function for your scenario.
⚠️ WARNING
August, 2022
DO NOT USE SHA-1 OR MD5 FOR CRYPTOGRAPHIC APPLICATIONS. Both of these algorithms are broken (an MD5 collision can be found in about 30 seconds on a cell phone).
All hash functions are "broken"
The pigeonhole principle says that, try as hard as you will, you cannot fit more than 2 pigeons in 2 holes, one pigeon per hole (unless you cut the pigeons up). Similarly, you cannot fit 2^128 + 1 numbers in 2^128 slots. All hash functions produce a hash of finite size, which means that a collision is guaranteed to exist among any 2^n + 1 distinct inputs to an n-bit hash. It's just not feasible to search that many inputs. Not for MD5 and not for Skein.
MD5/SHA1/SHA2xx have no chance collisions
All hash functions have collisions; it's a fact of life. Coming across these collisions by accident is the equivalent of winning the intergalactic lottery. That is to say, no one wins the intergalactic lottery; it's just not the way the lottery works. You will not come across an accidental MD5/SHA1/SHA2xx collision, EVER. Every word in every dictionary, in every language, hashes to a different value. Every path name, on every machine on the entire planet, has a different MD5/SHA1/SHA2xx hash. How do I know that, you may ask? Well, as I said before, no one wins the intergalactic lottery, ever.
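To put a rough number on the "intergalactic lottery", the standard birthday approximation can be used. It assumes the hash behaves like a uniformly random function; the item count n = 10^12 is an illustrative assumption, not a figure from the original answer.

```latex
% Probability of at least one accidental collision among n random inputs
% hashed to b bits (birthday approximation, valid when n^2 << 2^b):
\[
  p \;\approx\; 1 - e^{-n(n-1)/2^{b+1}} \;\approx\; \frac{n^2}{2^{b+1}}
\]
% MD5 (b = 128), n = 10^{12} hashed items:
\[
  p \approx \frac{10^{24}}{2^{129}} \approx 1.5 \times 10^{-15}
\]
% SHA-1 (b = 160), same n:
\[
  p \approx \frac{10^{24}}{2^{161}} \approx 3.4 \times 10^{-25}
\]
```

Even with a trillion hashed names, an accidental MD5 collision is far less likely than a hardware fault corrupting the data.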
But ... MD5 is broken
Sometimes the fact that it's broken does not matter.
As it stands there are no known pre-image or second pre-image attacks on MD5.
So what is so broken about MD5, you may ask? It is possible for a third party to generate 2 messages, one of which is EVIL and another of which is GOOD that both hash to the same value. (Collision attack)
Nonetheless, the current RSA recommendation is not to use MD5 if you need pre-image resistance. People tend to err on the side of caution when it comes to security algorithms.
So what hash function should I use in .NET?
Repeat this after me: there are no chance MD5 collisions, but malicious collisions can be carefully engineered. Even though there are no known pre-image attacks on MD5 to date, the line from the security experts is that MD5 should not be used where you need to defend against pre-image attacks. The SAME goes for SHA1.
Keep in mind, not all algorithms need to defend against pre-image or collision attacks. Take the trivial case of a first pass search for duplicate files on your HD.
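A first-pass duplicate search of this kind might look like the sketch below. The `DuplicateFinder` class and its method names are hypothetical. Grouping by file length before hashing is an added shortcut that the answer does not mention. Files that end up in the same group are only candidates; a byte-by-byte comparison would confirm them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper illustrating a first-pass duplicate search.
static class DuplicateFinder
{
    // Hashes a file's contents with MD5 and returns the digest as a hex string.
    static string HashFile(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(md5.ComputeHash(stream));
        }
    }

    // Returns groups of paths under root whose length and MD5 digest match.
    public static List<List<string>> FindCandidates(string root)
    {
        return Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .GroupBy(p => new FileInfo(p).Length)   // cheap filter: same size first
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.GroupBy(p => HashFile(p)))
            .Where(g => g.Count() > 1)
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Speed matters more than attack resistance here, since nobody gains anything by forging a collision on your own disk.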
No one has ever found a SHA512 collision. EVER. People have tried really hard. For that matter, no one has ever found a SHA256 or SHA384 collision either.
RIPEMD has not received the same amount of scrutiny that the SHA family and MD5 have. Both SHA1 and RIPEMD are vulnerable to birthday attacks. They are both slower than MD5 on .NET and come in the awkward 20-byte size. It's pointless to use these functions; forget about them.
SHA1 collision attacks are down to 2^52; it's not going to be too long until SHA1 collisions are out in the wild.
For up-to-date information about the various hash functions, have a look at the hash function zoo.
But wait there is more
Having a fast hash function can be a curse. For example, a very common use of hash functions is password storage. Essentially, you calculate the hash of a password combined with a known random string (a salt, to impede rainbow table attacks) and store that hash in the database.
The problem is that if an attacker gets a dump of the database, he can guess passwords quite effectively using brute force. Every combination he tries takes only a fraction of a millisecond, and he can try out hundreds of thousands of passwords a second.
To work around this issue, the bcrypt algorithm can be used. It is designed to be slow, so an attacker will be heavily slowed down when attacking a system that uses bcrypt. Recently scrypt has made some headlines and is considered by some to be more effective than bcrypt, but I do not know of a .NET implementation.
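The same "deliberately slow" idea can be sketched with what .NET itself ships. The example below uses PBKDF2 through `Rfc2898DeriveBytes` rather than bcrypt, which is a substitution and not what the answer recommends. The `PasswordHasher` class, the iteration count and the sizes are illustrative assumptions, and the constructor taking a `HashAlgorithmName` assumes .NET Framework 4.7.2 or later.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: salted, deliberately slow password hashing via PBKDF2.
static class PasswordHasher
{
    const int SaltSize = 16;        // bytes; assumed value
    const int KeySize = 32;         // bytes of derived hash; assumed value
    const int Iterations = 100000;  // assumed work factor; tune for your hardware

    // Generates a fresh random salt to store alongside the hash.
    public static byte[] NewSalt()
    {
        var salt = new byte[SaltSize];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return salt;
    }

    // Derives the value to store from the password and its salt.
    public static byte[] Hash(string password, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, Iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(KeySize);
        }
    }

    // Recomputes the hash and compares it without an early exit on mismatch.
    public static bool Verify(string password, byte[] salt, byte[] expected)
    {
        byte[] actual = Hash(password, salt);
        int diff = actual.Length ^ expected.Length;
        for (int i = 0; i < actual.Length && i < expected.Length; i++)
        {
            diff |= actual[i] ^ expected[i];
        }
        return diff == 0;
    }
}
```

Raising the iteration count makes every guess costlier for an attacker, and a legitimate login pays that cost only once.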
Update:
Times have changed, and we have a SHA3 winner. I would recommend using Keccak (aka SHA3), the winner of the SHA3 contest.
Original Answer:
In order of weakest to strongest I would say:
Personally I'd use MD6, because one can never be too paranoid. If speed is a real concern I'd look at Skein, or SHA-256.
In MD5's defense, there is no known way to produce a file with an arbitrary MD5 hash. The original author must plan in advance to have a working collision. Thus if the receiver trusts the sender, MD5 is fine. MD5 is broken if the signer is malicious, but it is not known to be vulnerable to man-in-the-middle attacks.
It would be a good idea to take a look at the BLAKE2 algorithm.
According to its description, it is faster than MD5 and at least as secure as SHA-3. It is also implemented by several software applications, including WinRAR.
Which one you use really depends on what you are using it for. If you just want to make sure that files don't get corrupted in transit and aren't that concerned about security, go for fast and small. If you need digital signatures for multi-billion dollar federal bailout agreements and need to make sure they aren't forged, go for hard to spoof and slow.
I would like to chime in (before md5 gets torn apart) that I do still use md5 extensively despite its overwhelming brokenness for a lot of crypto.
As long as you don't care to protect against collisions (you are still safe to use md5 in an hmac as well) and you do want the speed (sometimes you want a slower hash) then you can still use md5 confidently.
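For the HMAC case mentioned above, a minimal sketch using the framework's `HMACMD5` class could look like this. The `MessageTag` class and the choice of UTF-8 encoding are assumptions made for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: computes a keyed MD5 tag (HMAC-MD5) for a message.
static class MessageTag
{
    public static byte[] Compute(byte[] key, string message)
    {
        using (var hmac = new HMACMD5(key))
        {
            // The tag is 16 bytes, the same size as a plain MD5 digest.
            return hmac.ComputeHash(Encoding.UTF8.GetBytes(message));
        }
    }
}
```

The secret key is what separates this from a bare MD5 hash: without it, an attacker cannot compute or check tags offline.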
I am not an expert at this sort of thing, but I keep up with the security community and a lot of people there consider the md5 hash broken. I would say that which one to use depends on how sensitive the data is and the specific application. You might be able to get away with a slightly less secure hash as long as the key is good and strong.
Here are my suggestions for you:
See here for a paper detailing an algorithm to create md5 collisions in 31 seconds with a desktop Intel P4 computer.
http://eprint.iacr.org/2006/105