Recursive MD5 and collision probability
I wonder whether it is 'safe' to hash a bunch of MD5 hash values together to create a new hash, or whether this will in any way increase the probability of collisions.
The background: I have a couple of files with dependencies. Each file has an associated hash value that is calculated from its content. Let's call this the 'single-file' hash value. In addition, each file should also have a hash value that covers all the dependent files, the 'multi-file' hash value.
So the question is: can I just take the single-file MD5 hash values of all the dependent files, concatenate them, and then calculate an MD5 over the concatenated values to get the multi-file hash value? Or will this result in an MD5 hash that is more likely to collide than if I concatenated the content of all dependent files together?
Alternatively, could I XOR the single-file hash values together to generate a multi-file hash value, or would this likely result in more collisions?
Comments (3)
Sounds like you need a Merkle tree.
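A minimal sketch of that idea in Python, assuming the dependency graph is acyclic and that a file's multi-file hash should cover its own single-file digest followed by the multi-file digests of its dependencies. The names `single_file_hash`, `multi_file_hash`, `contents` and `deps` are hypothetical, not taken from the question.

```python
import hashlib


def single_file_hash(content: bytes) -> bytes:
    """MD5 digest (16 raw bytes) of one file's content."""
    return hashlib.md5(content).digest()


def multi_file_hash(name, contents, deps, _cache=None):
    """Merkle-style digest: own digest, then each dependency's multi-file digest.

    Assumes `deps` describes an acyclic graph (a cycle would recurse forever).
    Dependencies are visited in sorted order so the result is deterministic.
    """
    if _cache is None:
        _cache = {}
    if name not in _cache:
        h = hashlib.md5()
        h.update(single_file_hash(contents[name]))
        for dep in sorted(deps.get(name, [])):
            h.update(multi_file_hash(dep, contents, deps, _cache))
        _cache[name] = h.digest()
    return _cache[name]


# Hypothetical example data: a.c depends on a.h, which depends on b.h.
contents = {"a.c": b"int main(){}", "a.h": b"#define X 1", "b.h": b"#define Y 2"}
deps = {"a.c": ["a.h"], "a.h": ["b.h"]}
root = multi_file_hash("a.c", contents, deps)
```

A change anywhere below a file changes that file's multi-file digest, and unchanged subtrees can reuse their cached digests instead of being re-read.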
MD5 has a lot of collision problems; see the MD5 entry on Wikipedia.
However, if you use MD5 not for security but as a unique marker to check dependencies, even hashing concatenated hashes should be pretty safe.
Or, if it's not too late, switch to SHA-1 or, better, a SHA-2 function such as SHA-256, since practical SHA-1 collisions have also been published.
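As a rough illustration of "hashing concatenated hashes", and of how it differs from the XOR variant mentioned in the question, here is a small Python sketch. The helper names `combine_concat` and `combine_xor` are made up for this example. Because every MD5 digest is exactly 16 bytes, the concatenation has unambiguous boundaries. XOR, on the other hand, is commutative and self-cancelling, so it ignores order and a digest that appears twice drops out entirely.

```python
import hashlib
from functools import reduce


def md5_digest(data: bytes) -> bytes:
    return hashlib.md5(data).digest()


def combine_concat(digests):
    """Multi-file hash: MD5 over the concatenated 16-byte digests."""
    return hashlib.md5(b"".join(digests)).digest()


def combine_xor(digests):
    """Multi-file hash: bytewise XOR of all digests (shown only for contrast)."""
    return reduce(
        lambda acc, d: bytes(x ^ y for x, y in zip(acc, d)),
        digests,
        bytes(16),  # all-zero starting value
    )


a, b, c = (md5_digest(s) for s in (b"file a", b"file b", b"file c"))

# XOR loses ordering and cancels duplicates:
xor_order_blind = combine_xor([a, b]) == combine_xor([b, a])       # True
xor_dup_cancels = combine_xor([a, b, b]) == combine_xor([a])       # True

# Hashing the concatenation keeps both distinctions:
concat_order_sensitive = combine_concat([a, b]) != combine_concat([b, a])
concat_dup_sensitive = combine_concat([a, b, b]) != combine_concat([a])
```

If the order of dependencies should not matter, sorting the digests before concatenating them gives an order-independent result without the cancellation problem.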
I think the risk of a collision is about the same for hashing the concatenated files as for hashing the concatenated file hashes.
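A back-of-the-envelope check of that claim, as a Python sketch. It assumes accidental (non-adversarial) collisions only and treats MD5 as if it produced uniformly random 128-bit values. The counts `n_files` and `n_sets` are hypothetical. With two levels of hashing, a multi-file collision needs either two different files with the same single-file digest or two different digest lists with the same outer digest, so the union bound simply adds the two terms.

```python
from fractions import Fraction


def birthday_bound(n_items: int, bits: int = 128) -> Fraction:
    """Upper bound n(n-1)/2 / 2**bits on any two of n random digests colliding."""
    return Fraction(n_items * (n_items - 1), 2 * 2**bits)


n_files = 10**6  # hypothetical number of distinct files
n_sets = 10**6   # hypothetical number of distinct dependency sets

# One level: MD5 over the concatenated file contents.
direct = birthday_bound(n_sets)

# Two levels: MD5 over concatenated per-file MD5 digests (union bound).
two_level = birthday_bound(n_files) + birthday_bound(n_sets)

print(float(direct), float(two_level))
```

Under these assumptions both figures come out around 10^-27, so the extra hashing level costs at most a small constant factor. Deliberately constructed MD5 collisions are a separate matter and are not covered by this estimate.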