How long does it take to create a SHA-1 hash?
Roughly how long does it take, and how much processing power is required, to create a SHA-1 hash of some data? Does this vary much with the size of the original data? Would hashing a standard HTML file take significantly longer than hashing the string "blah"? How do C++, Java, and PHP compare in speed?
Comments (3)
You've asked a lot of questions, so hopefully I can try to answer each one in turn.
SHA-1 (like many other hashes designed to be cryptographically strong) is based on repeatedly applying a block-cipher-like compression routine to fixed-size blocks of data. Consequently, computing the hash of a long string takes proportionally more time than computing the hash of a short one. Mathematically, the runtime to hash a string of length N with SHA-1 is O(N). Hashing an HTML document will therefore take longer than hashing the string "blah", but only in proportion to its size. It won't take dramatically longer.
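To get a feel for the O(N) behaviour described above, here is a rough timing sketch in Python using the standard `hashlib` module. The helper name `time_sha1`, the input sizes, and the repeat count are my own choices for illustration, and the absolute numbers will depend entirely on your machine.

```python
import hashlib
import time


def time_sha1(data: bytes, repeats: int = 50) -> float:
    """Return the average number of seconds one SHA-1 pass over `data` takes."""
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.sha1(data).hexdigest()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Illustrative sizes: the 4-byte string "blah", a 64 KiB page, a 1 MiB file.
    for size in (4, 64 * 1024, 1024 * 1024):
        payload = b"a" * size
        micros = time_sha1(payload) * 1e6
        print(f"{size:>8} bytes: {micros:10.1f} microseconds per hash")
```

For the tiny input, fixed call overhead tends to dominate, so the linear growth only becomes visible once the input spans many blocks.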
As for comparing C++, Java, and PHP in terms of speed, this is dangerous territory and my answer is likely to get blasted, but generally speaking C++ is slightly faster than Java, which is slightly faster than PHP. A well-written hash implementation in any one of those languages can dramatically outperform a poorly written one in another. However, you shouldn't need to worry about this. It is generally considered a bad idea to implement your own hash functions, encryption routines, or decryption routines, because hand-written implementations are often vulnerable to side-channel attacks, in which an attacker breaks your security by exploiting properties of the implementation (such as timing differences) that are extremely difficult to anticipate. If you want to use a good hash function, use a prewritten version. It's likely to be faster, safer, and less error-prone than anything you do by hand.
Finally, I'd suggest not using SHA-1 at all. SHA-1 has known cryptographic weaknesses, and you should consider using a stronger hash algorithm instead, such as SHA-256.
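As a minimal sketch of the "use a prewritten version" advice combined with the SHA-256 suggestion, this is roughly what it looks like in Python with the standard `hashlib` module. The wrapper name `sha256_hex` and the choice of UTF-8 are assumptions for the example; Java and PHP ship comparable built-in facilities.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash `text` (encoded as UTF-8) with the standard library's SHA-256."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 64-character hexadecimal digest.
    print(sha256_hex("blah"))
```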
Hope this helps!
The "speed" of cryptographic hash functions is often measured in "clock cycles per byte". See this page for an admittedly outdated comparison, where you can see how implementation and architecture influence the results. The results depend not only on the algorithm being used but also, to a large degree, on your processor architecture, the quality of the implementation, and whether the implementation uses the hardware efficiently. That's why some companies specialize in building hardware designed specifically to run certain cryptographic algorithms as efficiently as possible.
SHA-512 is a good example. It works on larger data chunks than SHA-256, so one might be inclined to think that it should generally be slower than SHA-256, which works on smaller chunks. But SHA-512 is especially well suited to 64-bit processors and sometimes performs even better than SHA-256 there.
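If you want to check the SHA-256 versus SHA-512 comparison on your own hardware, a rough sketch along these lines should do. The function name `throughput_mb_per_s`, the 8 MB payload, and the repeat count are illustrative assumptions, and which algorithm wins depends on your CPU and on how your Python build implements them.

```python
import hashlib
import time


def throughput_mb_per_s(name: str, data: bytes, repeats: int = 20) -> float:
    """Return the approximate hashing throughput of algorithm `name` in MB/s."""
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.new(name, data).digest()
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(data) * repeats / elapsed / 1e6


if __name__ == "__main__":
    payload = b"\x00" * (8 * 1000 * 1000)  # 8 MB of dummy data
    for name in ("sha256", "sha512"):
        block = hashlib.new(name).block_size  # internal block size in bytes
        rate = throughput_mb_per_s(name, payload)
        print(f"{name}: block size {block} bytes, about {rate:.0f} MB/s")
```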
All modern hash algorithms work on fixed-size blocks of data. They perform a fixed number of deterministic operations on a block, and do this for every block until you finally get the result. This also means that the longer your input, the longer the operation will take. From these characteristics we can deduce that the duration of the operation is directly proportional to the input size of the message. Mathematically, or in computer-science terms, we call this an O(n) operation, where n is the input size of the message, as templatetypedef already pointed out.
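The block-by-block processing described above can be mimicked from the outside with Python's `hashlib`, which exposes an incremental `update()` call. This is only a sketch (the function name `sha1_blockwise` is made up, and the library does its own internal buffering), but it shows that feeding the input one block at a time yields the same digest as hashing it in one go.

```python
import hashlib


def sha1_blockwise(data: bytes) -> str:
    """Feed `data` to SHA-1 one block at a time and return the hex digest."""
    hasher = hashlib.sha1()
    step = hasher.block_size  # the algorithm's internal block size in bytes
    for offset in range(0, len(data), step):
        hasher.update(data[offset:offset + step])
    return hasher.hexdigest()


if __name__ == "__main__":
    message = b"blah" * 1000
    # Same digest either way; the loop just makes the per-block work explicit.
    print(sha1_blockwise(message) == hashlib.sha1(message).hexdigest())
```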
You should not let the speed of hashing influence your choice of programming language. All modern hash algorithms are really, really fast, regardless of the language. Although C-based implementations will do slightly better than Java, which in turn will probably be slightly faster than PHP, I bet you won't notice the difference in practice.
SHA-1 processes the data in chunks of 64 bytes. The CPU time needed to hash a file of length n bytes is thus roughly equal to n/64 times the CPU time needed to process one chunk. For a short string, you must first convert the string to a sequence of bytes (SHA-1 works on bytes, not on characters); the string "blah" will become 4 or 8 bytes (if you use UTF-8 or UTF-16, respectively), so it will be hashed as a single chunk. Note that the conversion from characters to bytes may take more time than the hashing itself.
Using the pure Java SHA-1 implementation from sphlib, on my PC (x86 Core2, 2.4 GHz, 64-bit mode), I can hash long messages at a bandwidth of 132 MB/s (that's using a single CPU core). Note that this exceeds the speed of a common hard disk, so when hashing a big file, chances are that the disk will be the bottleneck, not the CPU: the time needed to hash the file will essentially be the time needed to read the file from the disk.
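To make the character-to-byte step and the chunk count concrete, here is a small Python sketch. The names `sha1_block_count`, `SHA1_BLOCK_BYTES`, and `SHA1_PADDING_OVERHEAD` are mine. The count includes SHA-1's padding (one marker byte plus an 8-byte length field), which the rough n/64 estimate above leaves out, and the UTF-16 case uses the `utf-16-le` codec so that no byte-order mark is added.

```python
import hashlib

SHA1_BLOCK_BYTES = 64       # SHA-1 consumes input in 64-byte chunks
SHA1_PADDING_OVERHEAD = 9   # 1 marker byte + 8-byte message length


def sha1_block_count(message: bytes) -> int:
    """Return how many 64-byte chunks SHA-1 processes for `message`."""
    padded = len(message) + SHA1_PADDING_OVERHEAD
    return (padded + SHA1_BLOCK_BYTES - 1) // SHA1_BLOCK_BYTES


if __name__ == "__main__":
    for codec in ("utf-8", "utf-16-le"):
        raw = "blah".encode(codec)  # characters must become bytes first
        print(codec, len(raw), "bytes,", sha1_block_count(raw), "chunk(s),",
              hashlib.sha1(raw).hexdigest())
```

Note that the two encodings produce different digests for the same text, because the hash only ever sees the bytes.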
(Also, using native code written in C, SHA-1 speed goes up to 330 MB/s.)
SHA-256 is widely considered to be more secure than SHA-1, and a pure Java implementation of SHA-256 runs at 85 MB/s on my PC, which is still quite fast. As of 2011, SHA-1 is not recommended.
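If you want to see whether the disk or the CPU is the bottleneck for a large file, a sketch like the following reads the file in chunks and reports how long the whole pass took. The function name `hash_file`, the 1 MiB chunk size, and the SHA-256 default are assumptions for illustration; the timing covers reading and hashing together, so on a slow disk it mostly reflects read time.

```python
import hashlib
import time


def hash_file(path: str, algorithm: str = "sha256",
              chunk_size: int = 1024 * 1024) -> tuple[str, int, float]:
    """Hash the file at `path`; return (hex digest, bytes read, seconds taken)."""
    hasher = hashlib.new(algorithm)
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            hasher.update(chunk)
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return hasher.hexdigest(), total, elapsed
```

Dividing the byte count by the elapsed time gives a throughput figure you can compare against an in-memory benchmark of the same algorithm.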