What are my md5 bits?
I'm trying to code an MD5 hash function in Python, but it doesn't seem to work. I've isolated the problem to the message bits that are to be hashed. Yes, I'm actually converting each byte to bits and forming a bit message (I want to study the algorithm at the bit level). And this is where things are falling apart: my bit string is not correctly formed.
The simplest message would be "". It is 0 bytes long, so the padding would be a "1" followed (or not) by 511 "0"s (the last 64 bits denote the message length, which, as already said, is just 0).
10000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
I'm feeding 32-bit chunks of data into the transform function, one at a time. I've tried manually placing the 1 in every position of the first chunk, as well as of the last chunk (little endian). Where should the "1" be?
Thank you.
Update: The correct value for the first 32-bit word fed into the transform should in fact be 00000000000000000000000010000000, for which int(x,2) is 128.
This mess was due to my transform step, A = rotL((A+F(B,C,D)+int(messageBits[0],2)+sinList[0]), s11)+B, which uses int() to interpret the bit strings as integer data. int() reads the leftmost bit as the most significant one, so 100... came out as a very large number (2**31) instead of 128.
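To make the pitfall concrete, here is a minimal sketch of one way to turn a 32-character bit string into the word MD5 expects. The helper name word_from_bits is hypothetical and not part of the original code. It assumes each chunk is exactly 32 characters of "0" and "1", in message order.

```python
def word_from_bits(chunk):
    """Interpret a 32-character bit string as an MD5 word (hypothetical helper)."""
    if len(chunk) != 32:
        raise ValueError("expected exactly 32 bits")
    # Split into four 8-bit groups; bits inside each group stay MSB-first.
    groups = [chunk[i:i + 8] for i in range(0, 32, 8)]
    # The first byte is the least significant one, so reverse the byte
    # order before handing the string to int().
    return int("".join(reversed(groups)), 2)


first_chunk = "1" + "0" * 31

naive = int(first_chunk, 2)          # leftmost bit treated as most significant
fixed = word_from_bits(first_chunk)  # byte order reversed first

print(naive)  # 2147483648
print(fixed)  # 128
```

With a helper like this, int(messageBits[0],2) in the transform line could be replaced by word_from_bits(messageBits[0]), assuming messageBits holds the 32-bit chunks as strings.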
Comments (1)
MD5 uses big-endian convention at bit level, then little-endian convention at byte level.
The input is an ordered sequence of bits. Eight consecutive bits are a byte. A byte has a numerical value between 0 and 255; each bit in a byte has value 128, 64, 32, 16, 8, 4, 2 or 1, in that order (that's what "big-endian at bit level" means).
Four consecutive bytes are a 32-bit word. The numerical value of the word is between 0 and 4294967295. The first byte is least significant in that word ("little-endian at byte level"). Hence, if the four bytes are a, b, c and d in that order, then the word numerical value is a+256*b+65536*c+16777216*d.
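The two conventions can be spelled out as plain arithmetic. This is only an illustrative sketch: byte_value and word_value are made-up names, and the cross-check against int.from_bytes is there to show that "little-endian at byte level" matches Python's own notion of little-endian.

```python
def byte_value(bits8):
    """Value of 8 bits, first bit worth 128 (big-endian at bit level)."""
    if len(bits8) != 8:
        raise ValueError("expected exactly 8 bits")
    weights = (128, 64, 32, 16, 8, 4, 2, 1)
    return sum(w for w, bit in zip(weights, bits8) if bit == "1")


def word_value(a, b, c, d):
    """Value of 4 bytes, first byte least significant (little-endian at byte level)."""
    return a + 256 * b + 65536 * c + 16777216 * d


a = byte_value("10000000")
word = word_value(a, 0, 0, 0)

print(a)     # 128
print(word)  # 128

# The same word, computed with the standard library for comparison.
assert word == int.from_bytes(bytes([a, 0, 0, 0]), "little")
```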
In software applications, input is almost always a sequence of bytes (its length, in bits, is a multiple of 8). The aggregation of bits into bytes is assumed to have already taken place. Thus, the extra '1' padding bit will be the first bit of the next byte, and, since the bit-level convention is big-endian, that next byte will have numerical value 128 (0x80).
For an empty message, the very first bit will be the '1' padding bit, followed by a whole bunch of zeros. The message length is also zero, which encodes yet other zeros. Therefore, the padded message block will be a single '1' followed by 511 '0', as you show. When bits are assembled into bytes, the first byte will have value 128, followed by 63 bytes of value 0. When bytes are grouped into 32-bit words, the first word (M0) will have numerical value 128, and the 15 other words (M1 to M15) will have numerical value 0.
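The empty-message case can be checked with a short sketch that works on bytes rather than bit strings. The function names md5_pad and block_words are assumptions for illustration. The padding rule used (a 0x80 byte, zeros up to 56 bytes modulo 64, then the 64-bit length in little-endian order) follows the description above and RFC 1321.

```python
import struct


def md5_pad(message):
    """Return the message with MD5 padding appended (illustrative sketch)."""
    # The length is counted in bits and reduced modulo 2**64.
    bit_length = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    # The '1' padding bit is the top bit of the next byte: 0x80.
    padded = message + b"\x80"
    # Zero bytes until the length is 56 modulo 64.
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # 64-bit message length, least significant byte first.
    return padded + struct.pack("<Q", bit_length)


def block_words(block):
    """Split one 64-byte block into sixteen little-endian 32-bit words."""
    return list(struct.unpack("<16I", block))


block = md5_pad(b"")
words = block_words(block)

print(len(block))  # 64
print(words[0])    # 128
print(words[1:] == [0] * 15)  # True
```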
Refer to the MD5 specification for details. What I describe above is what is explained in the first paragraph of section 2 of RFC 1321. The same encoding is used for the message bit length (at the end of the padding), and for writing out the final hash result.
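As a small illustration of that last point, the sketch below splits the well-known MD5 digest of the empty message into its four 32-bit state words and writes them back out. The variable names are made up for this example. The digest constant is the standard published value for the empty input.

```python
import struct

# Standard MD5 digest of the empty message, as 16 raw bytes.
EMPTY_DIGEST = bytes.fromhex("d41d8cd98f00b204e9800998ecf8427e")

# The digest is the four state words A, B, C, D, each written
# least significant byte first.
a, b, c, d = struct.unpack("<4I", EMPTY_DIGEST)
print(hex(a))  # 0xd98c1dd4

# Writing the words back with the same convention restores the digest.
rebuilt = struct.pack("<4I", a, b, c, d)
print(rebuilt.hex())  # d41d8cd98f00b204e9800998ecf8427e

# The length field uses the same byte order: a 3-byte message is 24 bits.
length_field = struct.pack("<Q", 24)
print(length_field.hex())  # 1800000000000000
```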