What tests would you write to verify the correctness of an MD5 implementation?

Posted 2024-10-31 09:39:36

Assume you have access to an "oracle" implementation whose output you trust to be correct.

The most obvious way to do this seems to be to run a set of known plaintext/hash combinations through the implementation and see if they come out as expected. An arbitrary number of these cases could be constructed by generating random plaintexts (using a static seed to keep it deterministic) and using the oracle to find their hashes.
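A minimal sketch of that approach in Python, assuming `hashlib.md5` plays the role of the trusted oracle. `md5_under_test` is a hypothetical stand-in for the implementation being checked, and the seed and case count are arbitrary choices:

```python
import hashlib
import random


def oracle_md5(data: bytes) -> str:
    # Trusted reference: the standard library's MD5.
    return hashlib.md5(data).hexdigest()


def md5_under_test(data: bytes) -> str:
    # Hypothetical placeholder: swap in the implementation being verified.
    return hashlib.md5(data).hexdigest()


def random_cases(count, max_len=16, seed=1234):
    # A fixed seed makes the generated plaintexts identical on every run.
    rng = random.Random(seed)
    for _ in range(count):
        length = rng.randint(0, max_len)
        yield bytes(rng.randrange(256) for _ in range(length))


def find_mismatches(impl, count=1000):
    # Return every generated input whose hash disagrees with the oracle.
    return [data for data in random_cases(count) if impl(data) != oracle_md5(data)]
```

An empty list from `find_mismatches(md5_under_test)` means every generated case agreed with the oracle; any returned input can be replayed directly because the seed is fixed.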

The major problem I see with this is that it's not guaranteed to hit possible corner cases. Generating more cases will reduce the likelihood of missing corner cases, but how many cases is enough?

There's also the side issue of choosing the lengths of these random plaintexts, because MD5 takes an arbitrary-length string as input. For my purposes, I don't care about long inputs (say, anything longer than 16 bytes), so your answer can rely on this being a "special purpose" MD5 implementation if that makes things simpler, or you can just answer for the general case if it's all the same.
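One possible way to handle the length question under the 16-byte cap, sketched in Python: since any message of 55 bytes or fewer pads into a single 64-byte MD5 block, the multi-block boundaries never come into play here, and it is cheap to cover every length from 0 to 16 explicitly. The function names, the per-length sample size, and the seed below are all assumptions for illustration; `md5_under_test` is again a hypothetical placeholder.

```python
import hashlib
import random

MAX_LEN = 16  # the question's stated cap on input length


def md5_under_test(data: bytes) -> str:
    # Hypothetical placeholder: swap in the implementation being verified.
    return hashlib.md5(data).hexdigest()


def length_stratified_cases(per_length=50, seed=2024):
    # Length 0 and length 1 are small enough to test exhaustively.
    yield b""
    for value in range(256):
        yield bytes([value])
    # For lengths 2..MAX_LEN, draw a fixed-size seeded sample per length,
    # so no length is skipped by chance.
    rng = random.Random(seed)
    for length in range(2, MAX_LEN + 1):
        for _ in range(per_length):
            yield bytes(rng.randrange(256) for _ in range(length))


def check_all_lengths(impl):
    # Return the inputs whose hash differs from hashlib's.
    return [
        data
        for data in length_stratified_cases()
        if impl(data) != hashlib.md5(data).hexdigest()
    ]
```

Stratifying by length rather than drawing the length at random guarantees each of the 17 possible lengths is exercised, which a purely random draw only makes likely.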

Comments (1)

南城旧梦 2024-11-07 09:39:36

If you have an algorithmic error, it's extremely likely that every hash will be wrong. Hashes are unforgiving by nature.

Since the majority of possible errors will be exposed quickly, you really won't need that many tests. The main things to cover are the edge cases:

  • Length=0 (input is empty)
  • Length=1
  • Length=16
  • Input contains at least one byte with value 0
  • Repeated patterns of bytes in the input (would this be a meaningful edge case for MD5?)
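The edge cases listed above could be written down roughly like this in Python. This is only a sketch: `md5_under_test` is a hypothetical placeholder for the implementation being checked, `hashlib.md5` stands in for the oracle, and the specific byte strings are arbitrary picks for each bullet. The three fixed digests are test vectors from RFC 1321.

```python
import hashlib


def md5_under_test(data: bytes) -> str:
    # Hypothetical placeholder: swap in the implementation being verified.
    return hashlib.md5(data).hexdigest()


# One input per edge case from the list; the concrete bytes are arbitrary.
EDGE_CASES = {
    "empty": b"",
    "one byte": b"a",
    "sixteen bytes": bytes(range(16)),
    "embedded zero byte": b"ab\x00cd",
    "all zero bytes": b"\x00" * 16,
    "repeated pattern": b"\xab\xcd" * 8,
}

# Fixed vectors from the RFC 1321 test suite, independent of any oracle.
KNOWN_VECTORS = {
    b"": "d41d8cd98f00b204e9800998ecf8427e",
    b"a": "0cc175b9c0f1b6a831c399e269772661",
    b"abc": "900150983cd24fb0d6963f7d28e17f72",
}


def failing_edge_cases(impl):
    # Collect the name of every case the implementation gets wrong.
    failures = []
    for name, data in EDGE_CASES.items():
        if impl(data) != hashlib.md5(data).hexdigest():
            failures.append(name)
    for data, expected in KNOWN_VECTORS.items():
        if impl(data) != expected:
            failures.append(f"known vector {data!r}")
    return failures
```

Keeping a few hard-coded vectors alongside the oracle comparison guards against the unlikely case where the oracle and the implementation share the same bug.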

If those all pass, perhaps along with tests for one or two more representative inputs, you could be pretty confident in your algorithm. There aren't that many edge cases (unless someone more familiar with the algorithm's details can think of some more).
