I am trying to figure out which data the crc32 field in the header of a RAR Recovery Record is computed over. I am trying to recreate a RAR volume from a previous RAR volume and the extracted contents. So far I have reached the point where only 12 bytes differ from the correct/original volume.
The names are based on the unrar source code (arcread.cpp) or the RAR technote.
A RAR file consists of blocks. They have a header and a body:
[header][body]
The header contains metadata that describes the body. One block type is the file header (HEAD_TYPE=0x74, a file in the archive).
[header:a...FILE_CRC...z][body]
The field FILE_CRC (4 bytes) is calculated on all the data available in the [body], which is a stored or compressed file.
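Taking that description at face value, the check for a file block can be sketched as below. This is only an illustration: the helper names `file_crc` and `crc_field_bytes` are made up here, and little-endian storage of the 4-byte field is an assumption.

```python
import struct
import zlib


def file_crc(body: bytes) -> int:
    """Standard CRC-32 of a block body, as an unsigned 32-bit integer."""
    return zlib.crc32(body) & 0xFFFFFFFF


def crc_field_bytes(body: bytes) -> bytes:
    """The 4 bytes one would expect in FILE_CRC (little-endian assumed)."""
    return struct.pack("<I", file_crc(body))


if __name__ == "__main__":
    body = b"123456789"  # stand-in for the [body] of a file block
    print(hex(file_crc(body)), crc_field_bytes(body).hex())
```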
The block of a Recovery Record (HEAD_TYPE=0x7a subblock) is very similar to a file block, but it contains three extra fields in the header:
[header:a...FILE_CRC...z, "Protect+", rsc, dsc][body]
rsc: recovery sector count (4 bytes)
dsc: data sector count (8 bytes)
assert dsc*2 + rsc*512 == size([body])
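The three extra fields and the size check above could be read roughly as follows. This is a sketch under assumptions: the fields are taken to be packed back to back in little-endian order, and `parse_rr_fields` and `expected_body_size` are hypothetical names, not functions from unrar.

```python
import struct

# "Protect+" (8 bytes), rsc (4 bytes), dsc (8 bytes); little-endian is assumed.
RR_FIELDS = struct.Struct("<8sIQ")


def parse_rr_fields(raw: bytes):
    """Return (rsc, dsc) from the 20 bytes that hold the three extra fields."""
    magic, rsc, dsc = RR_FIELDS.unpack(raw)
    if magic != b"Protect+":
        raise ValueError("not a Protect+ recovery record")
    return rsc, dsc


def expected_body_size(rsc: int, dsc: int) -> int:
    """Body size implied by the sector counts: dsc*2 + rsc*512."""
    return dsc * 2 + rsc * 512


if __name__ == "__main__":
    raw = b"Protect+" + struct.pack("<IQ", 3, 100)
    rsc, dsc = parse_rr_fields(raw)
    print(rsc, dsc, expected_body_size(rsc, dsc))
```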
You would think the FILE_CRC of this block is based on the data in the body, just like the file block, but this isn't the case. (This was verified independently by another person.)
So my question is, what data is used to calculate this crc32?
Some things I have tried already:
- starting from Protect+ etc., followed by the body
- everything before the start of the RR subblock
- I have brute-forced all possible ranges on a small RAR file.
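For reference, a brute force over all ranges can be sketched like this. It is not the script actually used: `find_crc_ranges` and its `start_value` parameter are hypothetical. Note that it only tests standard CRC-32 parameters unless `start_value` is varied, so a non-standard seed would go unnoticed.

```python
import zlib


def find_crc_ranges(data: bytes, target: int, start_value: int = 0):
    """Yield every (start, end) with crc32(data[start:end]) == target.

    Runs in O(n^2) by extending the running CRC one byte at a time,
    so it is only practical for small files.
    """
    target &= 0xFFFFFFFF
    view = memoryview(data)
    for start in range(len(data)):
        crc = start_value
        for end in range(start + 1, len(data) + 1):
            # Continue the CRC from the previous prefix with one more byte.
            crc = zlib.crc32(view[end - 1:end], crc)
            if crc & 0xFFFFFFFF == target:
                yield (start, end)


if __name__ == "__main__":
    blob = b"xx123456789yy"
    print(list(find_crc_ranges(blob, 0xCBF43926)))
```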
Comments (1)
Instead of using the default seed (-0x1 or 0xFFFFFFFF), an F was dropped (-0x10000000).
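One way to experiment with this is a CRC-32 routine whose initial register value is explicit. The sketch below is not the RAR code: `crc32_with_seed` is a hypothetical name, and it assumes the usual reflected polynomial 0xEDB88320 and an unchanged final XOR. It also shows how the two notations relate: 0x0FFFFFFF is 0xFFFFFFFF with one F dropped, and -0x10000000 (0xF0000000 unsigned) is its bitwise complement, which is the form `zlib.crc32` expects as its start value because zlib inverts that argument internally.

```python
import zlib


def _make_table():
    """Lookup table for the reflected CRC-32 polynomial 0xEDB88320."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


_TABLE = _make_table()


def crc32_with_seed(data: bytes, seed: int = 0xFFFFFFFF) -> int:
    """CRC-32 with an explicit initial register value (final XOR assumed unchanged)."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    body = b"123456789"  # stand-in for a recovery record body
    print(hex(crc32_with_seed(body)))              # default seed
    print(hex(crc32_with_seed(body, 0x0FFFFFFF)))  # seed with one F dropped
    # Same value through zlib, passing the complemented seed as start value.
    print(hex(zlib.crc32(body, -0x10000000 & 0xFFFFFFFF)))
```

Whether the dropped-F variant reproduces the FILE_CRC of a given recovery record has to be checked against a real archive; the sketch only makes the seed easy to vary.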
An email to the author was sent with the following response:
Like first thought, the FILE_CRC of the block is based on the data in the body. It looks as if there is a typo somewhere in the RAR code.
XADRARParser.m of TheUnarchiver2.7.1_src has the following commented code:
Almost 3 years later I found out that someone else had already found the solution to this problem earlier that year.