What methods can be used to analyze and guess a 4-bit checksum algorithm?
[Background Story]
I am working with a 5-year-old user identification system, and I am trying to add IDs to the database. The problem I have is that the system that reads the ID numbers requires some sort of checksum, and no-one working here now has ever worked with it, so no-one knows how it works.
I have access to the list of existing IDs, which already have correct checksums. Also, as the checksum only has 16 possible values, I can create any ID I want and run it through the authentication system up to 16 times until I get the correct checksum (but this is quite time-consuming).
[Question]
What methods can I use to help guess the checksum algorithm used for some data?
I have tried a few simple methods such as XORing and summing, but these have not worked.
So my question is: if I have data (in hexadecimal) like this:
data checksum
00029921 1
00013481 B
00026001 3
00004541 8
What methods can I use to work out what sort of checksum is used?
i.e. should I try sequential numbers such as 00029921,00029922,00029923,... or 00029911,00029921,00029931,... If I do this what patterns should I look for in the changing checksum?
Similarly, would comparing swapped digits tell me anything useful about the checksum?
i.e. 00013481 and 00031481
Is there anything else that could tell me something useful? What about inverting one bit, or maybe one hex digit?
I am assuming that this will be a common checksum algorithm, but I don't know where to start in testing it.
I have read the following links, but I am not sure if I can apply any of this to my case, as I don't think mine is a CRC.
stackoverflow.com/questions/149617/how-could-i-guess-a-checksum-algorithm
stackoverflow.com/questions/2896753/find-the-algorithm-that-generates-the-checksum
cosc.canterbury.ac.nz/greg.ewing/essays/CRC-Reverse-Engineering.html
[ANSWER]
I have now downloaded a much larger list of data, and it turned out to be simpler than I was expecting, but for completeness, here is what I did.
data:
00024901 A
00024911 B
00024921 C
00024931 D
00042811 A
00042871 0
00042881 1
00042891 2
00042901 A
00042921 C
00042961 0
00042971 1
00042981 2
00043021 4
00043031 5
00043041 6
00043051 7
00043061 8
00043071 9
00043081 A
00043101 3
00043111 4
00043121 5
00043141 7
00043151 8
00043161 9
00043171 A
00044291 E
From these, I could see that when a single digit was increased by some amount, the checksum increased by the same amount, as in:
00024901 A
00024911 B
Also, two digits swapped did not change the checksum:
00024901 A
00042901 A
This means that the weight applied to each digit (for these two positions at least) must be the same.
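Both observations can be checked mechanically over the whole list rather than by eye. The following is a minimal sketch, assuming the IDs are equal-length hex strings; the function names `single_digit_pairs` and `swapped_pairs` are made up for illustration, and `SAMPLES` is a small subset of the data above.

```python
from itertools import combinations

# A subset of the (ID, checksum) pairs listed above.
SAMPLES = {
    "00024901": 0xA,
    "00024911": 0xB,
    "00024921": 0xC,
    "00042901": 0xA,
    "00042921": 0xC,
    "00043021": 0x4,
    "00043031": 0x5,
}

def single_digit_pairs(samples):
    """For every pair of IDs differing in exactly one digit, return
    (position, digit delta mod 16, checksum delta mod 16)."""
    results = []
    for (a, ca), (b, cb) in combinations(sorted(samples.items()), 2):
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diff) == 1:
            i = diff[0]
            digit_delta = (int(b[i], 16) - int(a[i], 16)) % 16
            checksum_delta = (cb - ca) % 16
            results.append((i, digit_delta, checksum_delta))
    return results

def swapped_pairs(samples):
    """For every pair of IDs that differ only by two digits trading places,
    return (id_a, id_b, whether the checksums are equal)."""
    found = []
    for (a, ca), (b, cb) in combinations(sorted(samples.items()), 2):
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diff) == 2:
            i, j = diff
            if a[i] == b[j] and a[j] == b[i]:
                found.append((a, b, ca == cb))
    return found

for position, digit_delta, checksum_delta in single_digit_pairs(SAMPLES):
    print(f"position {position}: digit +{digit_delta}, checksum +{checksum_delta}")
for a, b, same in swapped_pairs(SAMPLES):
    print(f"{a} <-> {b}: checksum unchanged = {same}")
```

If the digit delta and checksum delta agree for a position, that position most likely has weight 1; if a swap leaves the checksum unchanged, the two positions involved most likely share a weight.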
Finally, the checksum for 00000000 was A, so I calculated the sum of the digits plus A, mod 16:
checksum = ( (Σ x_i) + 0xA ) mod 16
And this matched for all the values I had. Just to check that there was nothing sneaky going on with the first 3 digits that never changed in my data, I made up and tested some numbers as Eric suggested, and those all worked with this too!
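For reference, here is a short sketch of that formula, checked against the IDs listed in this post. It assumes each character of the ID is treated as one hex digit; the function name `checksum` and the `offset` parameter are my own labels, not anything taken from the original system.

```python
def checksum(id_hex: str, offset: int = 0xA) -> str:
    """Sum the hex digits of the ID, add the offset, and reduce mod 16.
    Returns the result as a single uppercase hex character."""
    total = sum(int(ch, 16) for ch in id_hex)
    return format((total + offset) % 16, "X")

# (ID, checksum) pairs copied from the question and from the larger list above.
KNOWN = """
00029921 1  00013481 B  00026001 3  00004541 8
00024901 A  00024911 B  00024921 C  00024931 D
00042811 A  00042871 0  00042881 1  00042891 2
00042901 A  00042921 C  00042961 0  00042971 1
00042981 2  00043021 4  00043031 5  00043041 6
00043051 7  00043061 8  00043071 9  00043081 A
00043101 3  00043111 4  00043121 5  00043141 7
00043151 8  00043161 9  00043171 A  00044291 E
""".split()

# Pair up alternating tokens: ID, checksum, ID, checksum, ...
pairs = list(zip(KNOWN[0::2], KNOWN[1::2]))
mismatches = [(i, c, checksum(i)) for i, c in pairs if checksum(i) != c]
print(f"{len(pairs)} IDs checked, {len(mismatches)} mismatches")
```

Any ID that does not fit would show up in `mismatches` as (ID, expected, computed).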
Comments (1)
Many checksums I've seen use simple weighted values based on the position of the digits. For example, if the weights are 3,5,7 the checksum might be 3*c[0] + 5*c[1] + 7*c[2], then mod 10 for the result. (In your case, mod 16, since you have a 4-bit checksum.)
To check if this might be the case, I suggest that you feed some simple values into your system to get an answer:
... etc. If there are simple weights based on position, this may reveal it. Even if the algorithm is something different, feeding in nice, simple values and looking for patterns may be enlightening. As Matti suggested, you/we will likely need to see more samples before decoding the pattern.
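One way to organize that kind of probing is sketched below. It assumes the checksum is linear, i.e. (sum of weight[i] * digit[i] + constant) mod 16, and that IDs are 8 hex digits long. Everything here is hypothetical: `make_probes`, `recover_weights`, and especially `simulated_oracle`, which only stands in for the real authentication system and uses made-up weights.

```python
def make_probes(length: int = 8) -> list[str]:
    """IDs with a single 1 in each position and zeros everywhere else."""
    return ["0" * i + "1" + "0" * (length - i - 1) for i in range(length)]

def recover_weights(oracle, length: int = 8) -> list[int]:
    """Estimate each position's weight (mod 16) from single-digit probes.
    `oracle` maps an ID string to its checksum as an int in 0..15.
    Only valid if the checksum really is linear in the digits."""
    base = oracle("0" * length)  # the constant term
    return [(oracle(probe) - base) % 16 for probe in make_probes(length)]

# Stand-in for the real system: made-up weights and constant, for demonstration only.
HYPOTHETICAL_WEIGHTS = [3, 5, 7, 1, 1, 1, 1, 1]

def simulated_oracle(id_hex: str) -> int:
    total = sum(w * int(ch, 16) for w, ch in zip(HYPOTHETICAL_WEIGHTS, id_hex))
    return (total + 0xA) % 16

print(make_probes())
print(recover_weights(simulated_oracle))
```

With the real system, each oracle call is up to 16 manual attempts, so the all-zero ID plus one probe per position is about the cheapest experiment that separates "plain digit sum" from "weighted sum".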