A file that contains its own checksum
Is it possible to create a file that contains its own checksum (MD5, SHA-1, whatever)? And to head off the jokers: I mean the checksum in plain form, not a function that calculates it.
I wrote a piece of code in C, ran a brute-force search for less than 2 minutes, and got this wonder:
The CRC32 of this string is 4A1C449B
Note that there must be no characters (end of line, etc.) after the sentence.
You can check it here:
http://www.crc-online.com.ar/index.php?d=The+CRC32+of+this+string+is+4A1C449B&en=Calcular+CRC32
This one is also fun:
Source code (sorry it's a little messy) here: http://www.latinsud.com/pub/crc32/
Yes. It's possible, and it's common with simple checksums. Getting a file to include its own md5sum would be quite challenging.
In the most basic case, you create a checksum value that causes the summed modulus to equal zero. The checksum function then becomes something like a test that the sum of all values, check value included, is zero modulo the chosen base.
The checksum then becomes part of the file and is itself checked. A very common example of this is the Luhn algorithm used in credit card numbers. The last digit is a check digit, and is itself part of the 16-digit number.
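A minimal Python sketch of both ideas, in case it helps. The function names are made up for illustration, and the modulus of 256 (one check byte) is an assumption, not something stated above. The Luhn part only validates a decimal string.

```python
def add_check_byte(payload: bytes) -> bytes:
    """Append one byte chosen so that all bytes sum to 0 modulo 256."""
    check = (-sum(payload)) % 256
    return payload + bytes([check])


def sums_to_zero(blob: bytes) -> bool:
    """Verify a blob produced by add_check_byte (check byte included in the sum)."""
    return sum(blob) % 256 == 0


def luhn_valid(number: str) -> bool:
    """Luhn check: the last digit is the check digit and is part of the number."""
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:          # same as adding the two digits of the product
                digit -= 9
        total += digit
    return total % 10 == 0
```

For example, `sums_to_zero(add_check_byte(b"hello"))` holds for any payload, and `luhn_valid("79927398713")` accepts the usual textbook Luhn example.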
Check this:
"I wish my crc32 was 802892ef..."
Well, I thought this was interesting, so today I coded a little Java program to find collisions. I thought I'd leave it here in case someone finds it useful:
The output:
Note the dots at the end of the message are actually part of the message.
On my i5-2500 it was going to take ~40 minutes to search the whole crc32 space from 00000000 to ffffffff, doing about 1.8 million tests/second. It was maxing out one core.
I'm fairly new to Java, so any constructive comments on my code would be appreciated.
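For anyone who wants to experiment, here is a rough Python sketch of this kind of search. It is not the Java program described above: the function name `find_self_describing`, the `{}` placeholder in the template, and the `bits` parameter (which truncates the CRC-32 so a demo run finishes quickly) are all assumptions made for illustration. Dots are appended as padding when no value works, in the same spirit as the trailing dots mentioned above.

```python
import zlib


def find_self_describing(template: str, bits: int = 32, max_padding: int = 64):
    """Search for a message that states its own (possibly truncated) CRC-32.

    template must contain one '{}' placeholder for the hex digits.
    Returns (message, value) or None if nothing was found.
    """
    width = bits // 4              # number of hex digits to print
    mask = (1 << bits) - 1         # keep only the low `bits` bits of the CRC
    for padding in range(max_padding + 1):
        suffix = "." * padding     # trailing dots are part of the message
        for candidate in range(1 << bits):
            text = template.format(f"{candidate:0{width}x}") + suffix
            if zlib.crc32(text.encode("ascii")) & mask == candidate:
                return text, candidate
    return None
```

Calling `find_self_describing("My truncated crc32 is {}", bits=16)` tries at most 65536 candidates per padding length. With `bits=32` the same loop covers the full CRC-32 space, which is very slow in pure Python.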
"My crc32 was c8cb204, and all I got was this lousy T-Shirt!"
Certainly, it is possible. But one of the uses of checksums is to detect tampering with a file. How would you know whether a file has been modified if the modifier can also replace the checksum?
Sure, you could concatenate the digest of the file itself to the end of the file. To check it, you would calculate the digest of all but the last part, then compare it to the value in the last part. Of course, without some form of encryption, anyone can recalculate the digest and replace it.
edit
I should add that this is not so unusual. One technique is to concatenate a CRC-32 so that the CRC-32 of the whole file (including that digest) is zero. This won't work with digests based on cryptographic hashes, though.
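A small Python sketch of the CRC-32 trick, under a couple of assumptions. It uses `zlib.crc32`, which applies a final XOR, so the CRC of the whole file (payload plus its little-endian CRC) comes out as a fixed constant residue, `0x2144DF1C`, rather than literally zero; a CRC variant without that final XOR would give zero. The function names are made up for illustration.

```python
import struct
import zlib

# Residue that zlib.crc32 yields for any data followed by its own
# little-endian CRC-32.
CRC32_RESIDUE = 0x2144DF1C


def append_crc32(payload: bytes) -> bytes:
    """Return payload with its CRC-32 appended as 4 little-endian bytes."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def is_intact(blob: bytes) -> bool:
    """Check the whole blob, CRC included, against the constant residue."""
    return len(blob) >= 4 and zlib.crc32(blob) == CRC32_RESIDUE
```

The checker never has to split the file into "content" and "checksum"; it runs the CRC over everything and compares with one constant.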
There is a neat implementation of the Luhn mod N algorithm in the python-stdnum library (see luhn.py). The calc_check_digit function calculates a digit or character which, when appended to the file (expressed as a string), creates a valid Luhn mod N string. As noted in many of the answers above, this gives a sanity check on the validity of the file, but no significant security against tampering. The receiver needs to know which alphabet is being used to define Luhn mod N validity.
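To show what such a check character looks like, here is a standalone Python sketch of Luhn mod N. It does not call python-stdnum; the function names `luhn_mod_n_check_char` and `luhn_mod_n_valid` are hypothetical, and both sides are assumed to agree on the alphabet.

```python
def luhn_mod_n_check_char(text: str, alphabet: str) -> str:
    """Compute the character that makes text + char a valid Luhn mod N string."""
    n = len(alphabet)
    total = 0
    factor = 2                      # rightmost payload character is doubled
    for char in reversed(text):
        addend = factor * alphabet.index(char)
        total += addend // n + addend % n   # sum the base-n digits of the product
        factor = 1 if factor == 2 else 2
    return alphabet[(n - total % n) % n]


def luhn_mod_n_valid(text: str, alphabet: str) -> bool:
    """Validate a string whose last character is its Luhn mod N check character."""
    n = len(alphabet)
    total = 0
    factor = 1                      # the check character itself is not doubled
    for char in reversed(text):
        addend = factor * alphabet.index(char)
        total += addend // n + addend % n
        factor = 2 if factor == 1 else 1
    return total % n == 0
```

With the decimal alphabet this reduces to the ordinary Luhn check digit; with a hexadecimal alphabet it could be applied to a hex dump of a file.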
I don't know if I understand your question correctly, but you could make the first 16 bytes of the file the checksum of the rest of the file.
So before writing a file, you calculate the hash, write the hash value first and then write the file contents.
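A minimal Python sketch of that layout, assuming MD5 as the hash (its digest happens to be 16 bytes). The names `pack` and `unpack` are made up for illustration.

```python
import hashlib

DIGEST_SIZE = 16  # an MD5 digest is 16 bytes


def pack(content: bytes) -> bytes:
    """Write the hash of the content first, then the content itself."""
    return hashlib.md5(content).digest() + content


def unpack(blob: bytes) -> bytes:
    """Split off the stored hash, verify it, and return the content."""
    stored, content = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if hashlib.md5(content).digest() != stored:
        raise ValueError("checksum mismatch")
    return content
```

Note that the hash covers only the bytes after the first 16, not the whole file, so this is a file that carries a checksum of its payload rather than of itself.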
If the question is asking whether a file can contain its own checksum (in addition to other content), the answer is trivially yes for fixed-size checksums, because a file could contain all possible checksum values.
If the question is whether a file could consist of its own checksum (and nothing else), it's trivial to construct a checksum algorithm that would make such a file impossible: for an n-byte checksum, take the binary representation of the first n bytes of the file and add 1. Since it's also trivial to construct a checksum that always encodes itself (i.e. do the above without adding 1), clearly there are some checksums that can encode themselves, and some that cannot. It would probably be quite difficult to tell which of these a standard checksum is.
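The two toy checksums can be written down in a few lines of Python. This is only a sketch: the function names are invented, short inputs are zero-padded, and the "add 1" variant is assumed to wrap around at the n-byte limit.

```python
def checksum_identity(data: bytes, n: int) -> bytes:
    """Toy checksum: the first n bytes of the file (zero-padded if shorter)."""
    return data[:n].ljust(n, b"\x00")


def checksum_plus_one(data: bytes, n: int) -> bytes:
    """Toy checksum: the first n bytes read as an integer, plus 1 (wrapping)."""
    value = int.from_bytes(data[:n].ljust(n, b"\x00"), "big")
    return ((value + 1) % (1 << (8 * n))).to_bytes(n, "big")


def is_own_checksum(data: bytes, checksum, n: int) -> bool:
    """True if the file consists of exactly its own n-byte checksum."""
    return len(data) == n and checksum(data, n) == data
```

Under `checksum_identity` every n-byte file is its own checksum; under `checksum_plus_one` no file is, because the checksum always differs from the first n bytes.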
You can, of course, but in that case the SHA digest of the whole file will not be the SHA you included: it is a cryptographic hash function, so changing a single bit in the file changes the whole hash. What you are looking for is a checksum calculated from the content of the file in a way that matches a set of criteria.
Sure.
The simplest way would be to run the file through an MD5 algorithm and embed that data within the file. If you want to try to hide it, you can split up the checksum and place the pieces at known points in the file (based on a portion of the file size, e.g. 30%, 50%, 75%).
Similarly you could encrypt the file, or encrypt a portion of the file (along with the MD5 checksum) and embed that in the file.
Edit
I forgot to say that you would need to remove the checksum data before using it.
Of course, if your file needs to be readily readable by another program, e.g. Word, then things become a little more complicated, as you don't want to "corrupt" the file so that it is no longer readable.