给定哈希算法,是否有更有效的方法来“取消哈希”?除了暴力?
所以我有一个哈希函数的代码,从它的外观来看,没有办法简单地取消它的哈希值(大量的按位与、或、移位等)。我的问题是,如果我需要在散列之前找出原始值,是否有比暴力破解一组可能值更有效的方法?
谢谢!
编辑:我应该补充一点,就我而言,出于我的目的,原始消息永远不会超过几个字符。
EDIT2:出于好奇,有没有什么方法可以在运行时执行此操作,而无需预先计算表?
So I have the code for a hashing function, and from the looks of it, there's no way to simply unhash it (lots of bitwise ANDs, ORs, Shifts, etc). My question is, if I need to find out the original value before being hashed, is there a more efficient way than just brute forcing a set of possible values?
Thanks!
EDIT: I should add that in my case, the original message will never be longer than several characters, for my purposes.
EDIT2: Out of curiosity, are there any ways to do this on the run, without precomputed tables?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
“Unhashing”被称为“原像攻击”:给定哈希输出,找到相应的输入。
如果哈希函数是“安全的”,那么没有比尝试可能的输入直到找到命中更好的攻击了;对于具有 n 位输出的哈希函数,哈希函数调用的平均次数约为 2n,即对于如果 n 大于 180 左右,则采用当前的地球技术。换句话说:对于给定的哈希函数,如果找到比该暴力方法更快的攻击方法,则该哈希函数被视为不可修复地损坏。
MD5被认为已被破坏,但由于其他弱点(有一种已发布的原像方法,其成本为2123.4,因此,这比暴力破解成本快了大约 24 倍——但到目前为止,它在技术上仍然不可行,无法得到证实)。
当已知哈希函数输入是相对较小空间的一部分时(例如,它是一个“密码”,因此它可以适合人类用户的大脑),则可以通过使用预先计算的表来优化原像攻击:攻击者仍然需要支付一次搜索成本,但他可以重复使用他的表来攻击多个实例。彩虹表是预先计算的表,具有节省空间的压缩表示形式:使用彩虹表,攻击者的瓶颈是 CPU 能力,而不是硬盘的大小。
"Unhashing" is called a "preimage attack": given a hash output, find a corresponding input.
If the hash function is "secure" then there is no better attack than trying possible inputs until a hit is found; for a hash function with a n-bit output, the average number of hash function invocations will be about 2n, i.e. Way Too Much for current earth-based technology if n is greater than 180 or so. To state it otherwise: if an attack method faster than this brute force method is found, for a given hash function, then the hash function is deemed irreparably broken.
MD5 is considered broken, but for other weaknesses (there is a published method for preimages with cost 2123.4, which is thus about 24 times faster than the brute force cost -- but it is still so far in the technologically unfeasible that it cannot be confirmed).
When the hash function input is known to be part of a relatively small space (e.g. it is a "password", so it could fit in the brain of a human user), then one can optimize preimage attacks by using precomputed tables: the attacker still has to pay the search cost once, but he can reuse his tables to attack multiple instances. Rainbow tables are precomputed tables with a space-efficient compressed representation: with rainbow tables, the bottleneck for the attacker is CPU power, not the size of his hard disks.
假设“正常情况”,原始消息将比哈希值长很多倍。因此,原则上绝对不可能从哈希中导出消息,因为您无法计算不存在的信息。
但是,您可以猜测什么可能是正确的消息,并且存在一些技术可以加速常见消息(例如密码)的此过程,例如彩虹表。如果哈希值匹配,那么看起来合理的消息很可能就是正确的消息。
最后,可能根本不需要找到好消息,只要能找到一条会传递的消息即可。这是已知的 MD5 攻击的主题。这种攻击可以让您创建一条不同的消息,该消息给出相同的哈希值。
这是否是一个安全问题取决于您使用哈希值的具体用途。
Assuming the "normal case", the original message will be many times longer than the hash. Therefore, it is in principle absolutely impossible to derive the message from the hash, simply because you cannot calculate information that is not there.
However, you can guess what's probably the right message, and there exist techniques to accelerate this process for common messages (such as passwords), for example rainbow tables. It is very likely that if something that looks sensible is the right message if the hash matches.
Finally, it may not be necessary at all to find the good message as long as one can be found which will pass. This is the subject of a known attack on MD5. This attack lets you create a different message which gives the same hash.
Whether this is a security problem or not depends on what exactly you use the hash for.
这可能听起来微不足道,但如果您有哈希函数的代码,您始终可以重写哈希表容器类的
hash()
函数(或类似函数,具体取决于您的编程语言和环境)。这样,您可以对 3 个或更少字符的字符串进行哈希处理,然后可以将哈希值存储为密钥,通过该密钥您可以获得原始字符串,这似乎正是您想要的。我想,使用这种方法来构建你自己的彩虹表。如果您有要在其中查找这些值的程序环境的代码,则可以随时修改它以将哈希值存储在哈希表中。This may sound trivial, but if you have the code to the hashing function, you could always override a hash table container class's
hash()
function (or similar, depending on your programming language and environment). That way, you can hash strings of say 3 characters or less, and then you can store the hash as a key by which you obtain the original string, which appears to be exactly what you want. Use this method to construct your own rainbow table, I suppose. If you have the code to the program environment in which you want to find these values out, you could always modify it to store hashes in the hash table.是的; 彩虹表攻击。对于较短字符串的哈希值尤其如此。即像“true”“false”“etc”这样的小字符串的哈希值可以存储在字典中并可以用作比较表。这大大加快了破解过程。此外,如果散列大小很短(即 MD5),则算法变得特别容易破解。当然,解决这个问题的方法是在对密码进行哈希处理之前将“加密盐”与密码结合起来。
关于此事有两个非常好的信息来源:编码恐怖:彩虹哈希破解 和
维基百科: Rainbow 表
编辑:Rainbox 表可以标记数十 GB,因此下载(或复制)它们可能会仅仅进行简单的测试就需要几周的时间。相反,似乎有一些在线工具可以反转简单的哈希值: http://www.onlinehashcrack.com/ (即尝试反转 463C8A7593A8A79078CB5C119424E62A,这是单词“crack”的 MD5 哈希值)
Yes; rainbow table attacks. This is especially true for hashes of shorter strings. i.e. hashes of small strings like 'true' 'false' 'etc' can be stored in a dictionary and can be used as a comparison table. This speeds up cracking process considerably. Also if the hash size is short (i.e. MD5) the algorithm becomes especially easy to crack. Of course, the way around this issue is combining 'cryptographic salts' with passwords, before hashing them.
There are two very good sources of info on the matter: Coding Horror: Rainbow Hash Cracking and
Wikipedia: Rainbow table
Edit: Rainbox tables can tage tens of gigabytes so downloading (or reproducing) them may take weeks just to make simple tests. Instead, there seems to be some online tools for reversing simple hashes: http://www.onlinehashcrack.com/ (i.e. try to reverse 463C8A7593A8A79078CB5C119424E62A which is MD5 hash of the word 'crack')