Can anyone define the Windows PE Checksum Algorithm?

Posted on 2024-11-16 15:26:37

I would like to implement this in C#

I have looked here:
http://www.codeproject.com/KB/cpp/PEChecksum.aspx

And am aware of the ImageHlp.dll MapFileAndCheckSum function.

However, for various reasons, I would like to implement this myself.

The best I have found is here:
http://forum.sysinternals.com/optional-header-checksum-calculation_topic24214.html

But, I don't understand the explanation. Can anyone clarify how the checksum is calculated?

Thanks!

Update

From the code example, I do not understand what this means or how to translate it into C#:

sum -= sum < low 16 bits of CheckSum in file // 16-bit borrow 
sum -= low 16 bits of CheckSum in file 
sum -= sum < high 16 bits of CheckSum in file 
sum -= high 16 bits of CheckSum in file 

Update #2

Thanks. I also came across some Python code that does something similar here:

    def generate_checksum(self):

        # This will make sure that the data representing the PE image
        # is updated with any changes that might have been made by
        # assigning values to header fields as those are not automatically
        # updated upon assignment.
        #
        self.__data__ = self.write()

        # Get the offset to the CheckSum field in the OptionalHeader
        #
        checksum_offset = self.OPTIONAL_HEADER.__file_offset__ + 0x40 # 64

        checksum = 0

        # Verify the data is dword-aligned. Add padding if needed
        #
        remainder = len(self.__data__) % 4
        data = self.__data__ + ( '\0' * ((4-remainder) * ( remainder != 0 )) )

        for i in range( len( data ) / 4 ):

            # Skip the checksum field
            #
            if i == checksum_offset / 4:
                continue

            dword = struct.unpack('I', data[ i*4 : i*4+4 ])[0]
            checksum = (checksum & 0xffffffff) + dword + (checksum>>32)
            if checksum > 2**32:
                checksum = (checksum & 0xffffffff) + (checksum >> 32)

        checksum = (checksum & 0xffff) + (checksum >> 16)
        checksum = (checksum) + (checksum >> 16)
        checksum = checksum & 0xffff

        # The length is the one of the original data, not the padded one
        #
        return checksum + len(self.__data__)

However, it's still not working for me - here is my conversion of this code:

using System;
using System.IO;

namespace CheckSumTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var data = File.ReadAllBytes(@"c:\Windows\notepad.exe");

            var PEStart = BitConverter.ToInt32(data, 0x3c);
            var PECoffStart = PEStart + 4;
            var PEOptionalStart = PECoffStart + 20;
            var PECheckSum = PEOptionalStart + 64;
            var checkSumInFile = BitConverter.ToInt32(data, PECheckSum);
            Console.WriteLine(string.Format("{0:x}", checkSumInFile));

            long checksum = 0;

            var remainder = data.Length % 4;
            if (remainder > 0)
            {
                Array.Resize(ref data, data.Length + (4 - remainder));
            }

            var top = Math.Pow(2, 32);

            for (int i = 0; i < data.Length / 4; i++)
            {
                if (i == PECheckSum / 4)
                {
                    continue;
                }
                var dword = BitConverter.ToInt32(data, i * 4);
                checksum = (checksum & 0xffffffff) + dword + (checksum >> 32);
                if (checksum > top)
                {
                    checksum = (checksum & 0xffffffff) + (checksum >> 32);
                }
            }

            checksum = (checksum & 0xffff) + (checksum >> 16);
            checksum = (checksum) + (checksum >> 16);
            checksum = checksum & 0xffff;

            checksum += (uint)data.Length; 
            Console.WriteLine(string.Format("{0:x}", checksum));

            Console.ReadKey();
        }
    }
}

Can anyone tell me where I'm being stupid?

Comments (9)

偏爱你一生 2024-11-23 15:26:37

OK, I finally got it working. My problem was that I was using ints rather than uints!
So this code works (assuming the data is 4-byte aligned; otherwise you will have to pad it out a little). PECheckSum is the position of the CheckSum value within the PE file, and that field is of course skipped when calculating the checksum.

Note that the following is C# code.

static uint CalcCheckSum(byte[] data, int PECheckSum)
{
    long checksum = 0;
    var top = Math.Pow(2, 32);

    for (var i = 0; i < data.Length / 4; i++)
    {
        if (i == PECheckSum / 4)
        {
            continue;
        }
        var dword = BitConverter.ToUInt32(data, i * 4);
        checksum = (checksum & 0xffffffff) + dword + (checksum >> 32);
        if (checksum > top)
        {
            checksum = (checksum & 0xffffffff) + (checksum >> 32);
        }
    }

    checksum = (checksum & 0xffff) + (checksum >> 16);
    checksum = (checksum) + (checksum >> 16);
    checksum = checksum & 0xffff;

    checksum += (uint)data.Length;
    return (uint)checksum;

}
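
The int-versus-uint difference is easy to reproduce outside C#. The following Python sketch reads the same four bytes as a signed and as an unsigned 32-bit value, which roughly corresponds to BitConverter.ToInt32 versus BitConverter.ToUInt32. The byte values and the helper name add_dword are made up for illustration; the helper simply repeats one iteration of the loop above.

```python
import struct

# Four bytes whose top bit is set (illustrative values, not from a real PE file).
raw = bytes([0x00, 0x00, 0x00, 0x80])

signed = struct.unpack("<i", raw)[0]    # comparable to BitConverter.ToInt32
unsigned = struct.unpack("<I", raw)[0]  # comparable to BitConverter.ToUInt32


def add_dword(checksum, dword):
    """One loop iteration: add a dword, then fold any carry back in."""
    checksum = (checksum & 0xFFFFFFFF) + dword + (checksum >> 32)
    if checksum > 2**32:
        checksum = (checksum & 0xFFFFFFFF) + (checksum >> 32)
    return checksum


print(signed, unsigned)
print(add_dword(1, unsigned))  # running sum grows as intended
print(add_dword(1, signed))    # running sum goes negative
```

With the signed read, any dword whose top bit is set is subtracted from the running sum instead of added, which is enough to make the final checksum come out wrong.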
亽野灬性zι浪 2024-11-23 15:26:37

The code in the forum post is not strictly the same as what the actual disassembly of the Windows PE code shows. The CodeProject article you reference gives the "fold 32-bit value into 16 bits" step as:

mov edx,eax    ; EDX = EAX
shr edx,10h    ; EDX = EDX >> 16    EDX is high order
and eax,0FFFFh ; EAX = EAX & 0xFFFF EAX is low order
add eax,edx    ; EAX = EAX + EDX    High Order Folded into Low Order
mov edx,eax    ; EDX = EAX
shr edx,10h    ; EDX = EDX >> 16    EDX is high order
add eax,edx    ; EAX = EAX + EDX    High Order Folded into Low Order
and eax,0FFFFh ; EAX = EAX & 0xFFFF EAX is low order 16 bits  

Which you could translate into C# as:

// given: uint sum = ...;
uint high = sum >> 16; // take high order from sum
sum &= 0xFFFF;         // clear out high order from sum
sum += high;           // fold high order into low order

high = sum >> 16;      // take the new high order of sum
sum += high;           // fold the new high order into sum
sum &= 0xFFFF;         // mask to 16 bits
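
To see what the fold does to concrete numbers, here is the same sequence as a small Python sketch. The function name fold32_to_16 and the sample values are made up for illustration.

```python
def fold32_to_16(value):
    """Fold a 32-bit sum into 16 bits, mirroring the assembly step by step."""
    value &= 0xFFFFFFFF              # treat the input as a 32-bit register
    high = value >> 16               # shr edx,10h
    value = (value & 0xFFFF) + high  # and eax,0FFFFh / add eax,edx
    high = value >> 16               # the first add may have produced a new carry
    value += high                    # fold that carry in as well
    return value & 0xFFFF            # keep the low 16 bits


print(hex(fold32_to_16(0x1234ABCD)))  # no carry out of the low word
print(hex(fold32_to_16(0xFFFF0002)))  # the low word overflows, so the second fold matters
```

The second fold is only needed when adding the two halves overflows 16 bits, as in the second sample value.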
善良天后 2024-11-23 15:26:37

No one really answered the original question of "Can anyone define the Windows PE Checksum Algorithm?" so I'm going to define it as simply as possible. A lot of the examples given so far are optimizing for unsigned 32-bit integers (aka DWORDs), but if you just want to understand the algorithm itself at its most fundamental, it is simply this:

  1. Using an unsigned 16-bit integer (aka a WORD) to store the checksum, add up all of the WORDs of the data except for the 4 bytes of the PE optional header checksum. If the file is not WORD-aligned, then the last byte is a 0x00.

  2. Convert the checksum from a WORD to a DWORD and add the size of the file.

The PE checksum algorithm above is effectively the same as the original MS-DOS checksum algorithm. The only differences are the location that is skipped and the final step: instead of XORing with 0xFFFF at the end, the size of the file is added.

From my WinPEFile class for PHP, the above algorithm looks like:

    $x = 0;
    $y = strlen($data);
    $val = 0;
    while ($x < $y)
    {
        // Skip the checksum field location.
        if ($x === $this->pe_opt_header["checksum_pos"])  $x += 4;
        else
        {
            $val += self::GetUInt16($data, $x, $y);

            // In PHP, integers are either signed 32-bit or 64-bit integers.
            if ($val > 0xFFFF)  $val = ($val & 0xFFFF) + 1;
        }
    }

    // Add the file size.
    $val += $y;
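
For comparison, the two steps of the algorithm can be sketched in Python roughly as follows. This is only a sketch under assumptions: pe_checksum_words and checksum_offset are names made up here, the input is a bytes object holding the whole file, and the CheckSum field is assumed to start at an even offset.

```python
import struct


def pe_checksum_words(data, checksum_offset):
    """Sum all 16-bit words except the CheckSum field, then add the file size."""
    padded = data + b"\x00" * (len(data) % 2)  # odd-sized input gets one zero byte
    total = 0
    for pos in range(0, len(padded), 2):
        if checksum_offset <= pos < checksum_offset + 4:
            continue                           # skip the 4 bytes of the CheckSum field
        total += struct.unpack_from("<H", padded, pos)[0]
        if total > 0xFFFF:
            total = (total & 0xFFFF) + 1       # wrap the carry back into bit 0
    return total + len(data)                   # size of the original, unpadded data


# Tiny made-up buffer: two words, a 4-byte "CheckSum" field, one trailing byte.
sample = bytes([0xFF, 0xFF, 0x02, 0x00, 0xAA, 0xBB, 0xCC, 0xDD, 0x01])
print(pe_checksum_words(sample, 4))
```

For a real file, checksum_offset would be the position computed in the question (PE header offset + 4 + 20 + 64).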
执手闯天涯 2024-11-23 15:26:37

The Java code from emmanuel below may not work. In my case it hangs and never completes. I believe this is due to the heavy use of I/O in the code, in particular the data.read() calls. As a solution, these can be replaced by array accesses, with the RandomAccessFile reading the file into a byte array (or arrays) either fully or incrementally.

I attempted this, but the calculation was still too slow because of the conditional that checks for the checksum offset in order to skip the checksum field bytes. I would imagine that the OP's C# solution has a similar problem.

The code below removes that conditional as well.

public static long computeChecksum(RandomAccessFile data, int checksumOffset)
                throws IOException {
        
        ...
        byte[] barray = new byte[(int) length];     
        data.readFully(barray);
        
        long i = 0;
        long ch1, ch2, ch3, ch4, dword;
        
        while (i < checksumOffset) {
            
            ch1 = ((int) barray[(int) i++]) & 0xff;
            ...
            
            checksum += dword = ch1 | (ch2 << 8) | (ch3 << 16) | (ch4 << 24);

            if (checksum > top) {
                checksum = (checksum & 0xffffffffL) + (checksum >> 32);
            }
        }
        i += 4;
        
        while (i < length) {
            
            ch1 = ((int) barray[(int) i++]) & 0xff;
            ...
    
            checksum += dword = ch1 | (ch2 << 8) | (ch3 << 16) | (ch4 << 24);
            
            if (checksum > top) {
                checksum = (checksum & 0xffffffffL) + (checksum >> 32);
            }
        }
        
        checksum = (checksum & 0xffff) + (checksum >> 16);
        checksum = checksum + (checksum >> 16);
        checksum = checksum & 0xffff;
        checksum += length;
        
        return checksum;
    }

However, I still think that code is too verbose and clunky, so I swapped the RandomAccessFile for a channel and overwrote the offending bytes with zeros to eliminate the conditional. This code could probably still benefit from a cache-style buffered read.

public static long computeChecksum2(FileChannel ch, int checksumOffset)
            throws IOException {
    
    ch.position(0);
    long sum = 0;
    long top = (long) Math.pow(2, 32);
    long length = ch.size();
    
    ByteBuffer buffer = ByteBuffer.wrap(new byte[(int) length]);
    buffer.order(ByteOrder.LITTLE_ENDIAN);
    
    ch.read(buffer);
    buffer.putInt(checksumOffset, 0x0000);
    
    buffer.position(0);
    while (buffer.hasRemaining()) {
        sum += buffer.getInt() & 0xffffffffL;
        if (sum > top) {
            sum = (sum & 0xffffffffL) + (sum >> 32);
        }
    }   
    sum = (sum & 0xffff) + (sum >> 16);
    sum = sum + (sum >> 16);
    sum = sum & 0xffff;
    sum += length;
    
    return sum;
}
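
The zero-the-field trick is independent of Java. Below is a minimal Python sketch of the same idea, assuming the whole file fits in memory; pe_checksum_zeroed and checksum_offset are hypothetical names, and unlike the Java version the buffer is padded to a multiple of 4 before it is read as dwords.

```python
import struct


def pe_checksum_zeroed(data, checksum_offset):
    """Zero the CheckSum field in a copy, then sum every dword without a skip test."""
    buf = bytearray(data)
    buf[checksum_offset:checksum_offset + 4] = b"\x00\x00\x00\x00"  # adding zero changes nothing
    buf.extend(b"\x00" * (-len(buf) % 4))                           # pad to a dword boundary
    total = 0
    for (dword,) in struct.iter_unpack("<I", buf):
        total += dword
        if total > 0xFFFFFFFF:
            total = (total & 0xFFFFFFFF) + (total >> 32)            # end-around carry
    total = (total & 0xFFFF) + (total >> 16)
    total += total >> 16
    return (total & 0xFFFF) + len(data)                             # original length, not padded


# Tiny made-up buffer: one dword, a 4-byte "CheckSum" field, one trailing byte.
sample = bytes([0xFF, 0xFF, 0x02, 0x00, 0xAA, 0xBB, 0xCC, 0xDD, 0x01])
print(pe_checksum_zeroed(sample, 4))
```

Working on a copy keeps the caller's data intact, at the cost of holding the file in memory twice.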
只想待在家 2024-11-23 15:26:37

The CheckSum field is 32 bits long and is calculated as follows

1. Add all dwords (32 bit pieces) of the entire file to a sum

Add all dwords of the entire file, including all headers and all of the contents but not the CheckSum field itself, into a single dword. If the dword overflows, add the overflowed bit back to the lowest bit (2^0) of the dword.
If the file is not evenly divisible into dwords (4-byte pieces), see step 2.

The best way I know to implement this is the GNU C compiler's integer overflow builtin __builtin_uadd_overflow.
In the original ChkSum function documented by Jeffrey Walton, the sum
was calculated by performing an add (%esi),%eax, where
esi contains the base address of the file and eax is 0, and then adding the rest of the file like this:

adc 0x4(%esi),%eax
adc 0x8(%esi),%eax
adc 0xc(%esi),%eax
adc 0x10(%esi),%eax
...
adc $0x0,%eax

The first add adds the first dword, ignoring any carry flag. The following dwords
are added by the adc instruction, which does the same thing as add but
also adds the carry flag left by the previous instruction
to the sum. The final adc $0x0,%eax adds only the last carry flag, if it
was set, so this instruction must not be left out.

Please keep in mind that the dword of CheckSum field itself should not be added.
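
For readers who do not read x86 assembly, the add/adc chain can be simulated in a few lines of Python. This is only an illustration: adc_sum is a made-up name, and the caller is expected to have left out the CheckSum dword already.

```python
def adc_sum(dwords):
    """Add 32-bit values like the add/adc chain: each carry feeds the next addition."""
    eax, carry = 0, 0
    for value in dwords:
        total = eax + value + carry
        eax = total & 0xFFFFFFFF  # what fits into the 32-bit register
        carry = total >> 32       # carry flag: 1 if the addition overflowed
    total = eax + carry           # the final "adc $0x0,%eax"
    return total & 0xFFFFFFFF


print(adc_sum([0xFFFFFFFF, 0x00000002]))  # overflows once, the carry is added back
print(adc_sum([1, 2, 3]))                 # no overflow, plain addition
```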

2. Add the remainder to the sum if there is one

If the file is not evenly divisible into dwords, add the remainder as a
zero-padded dword. For example, say your file is 15 bytes long and looks like this:
0E 1F BA 0E | 00 B4 09 CD | 21 B8 01 4C | CD 21 54
You need to add the remainder to the sum as 0x005421CD. My system is
little-endian. I do not know whether the checksum would change because of the
different byte order on big-endian systems, or whether you would just have to simulate this
behaviour there.
I handle the remainder by rounding buffer_size up to the next byte count divisible by 4,
in other words the next whole dword count expressed
in bytes. Then I allocate with calloc because it initializes the memory block
to all zeros.

if(buffer_size%4) {
  buffer_size+=4-(buffer_size%4);
}
...
calloc(buffer_size,1)
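
The 15-byte example can be checked quickly in Python. The sketch below pads the bytes to a multiple of 4 and reads them as little-endian dwords; the variable names are made up for illustration.

```python
import struct

data = bytes.fromhex("0E1FBA0E 00B409CD 21B8014C CD2154")  # the 15-byte example
padded = data + b"\x00" * (-len(data) % 4)                  # round up to a multiple of 4
dwords = [d for (d,) in struct.iter_unpack("<I", padded)]   # "<I" = little-endian uint32

print(len(padded))
print(hex(dwords[-1]))  # the zero-padded remainder
```

Using an explicit "<I" format reads the bytes as little-endian regardless of the machine the script runs on.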

3. Add the lower word (16 bit piece) and the higher word of the sum together.

sum=(sum&0xffff)+(sum>>16);

4. Add the new higher word once again

sum+=(sum>>16);

5. Only keep the lower word

sum&=0xffff;

6. Add the number of bytes in the file to the sum

return(sum+size);

This is how I wrote it. It is C, not C#. off_t size is the number of bytes in the file. uint32_t *base is a pointer to the file loaded into memory. The block of memory should be zero-padded at the end up to the next byte count divisible by 4.

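#include <stdint.h>
#include <sys/types.h>

uint32_t pe_header_checksum(uint32_t *base,off_t size) {
  uint32_t sum=0;
  off_t i;
  for(i=0;i<(size/4);i++) {
    /* Skip the CheckSum dword. The index 0x36 is hard-coded for this file;
       in general use (PE header offset + 4 + 20 + 64) / 4, as in the question. */
    if(i==0x36) continue;
    /* Add with end-around carry without relying on evaluation order. */
    if(__builtin_uadd_overflow(base[i],sum,&sum)) sum++;
  }
  /* Trailing partial dword; the buffer must be zero-padded as described. */
  if(size%4 && __builtin_uadd_overflow(base[i],sum,&sum)) sum++;
  sum=(sum&0xffff)+(sum>>16);
  sum+=(sum>>16);
  sum&=0xffff;
  return(sum+size);
}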

If you want you can see the code in action and read a little bit more here.

如梦亦如幻 2024-11-23 15:26:37

The Java example is not entirely correct. The following Java implementation matches the result of Microsoft's original implementation, Imagehlp.MapFileAndCheckSumA.

It is important that the input bytes are masked with inputByte & 0xff and that the resulting long is masked again with currentWord & 0xffffffffL when it is used in the addition (note the L):

    long checksum = 0;
    final long max = 4294967296L; // 2^32

    // verify the data is DWORD-aligned and add padding if needed
    final int remainder = data.length % 4;
    final byte[] paddedData = Arrays.copyOf(data, data.length
            + (remainder > 0 ? 4 - remainder : 0));

    for (int i = 0; i <= paddedData.length - 4; i += 4)
    {
        // skip the checksum field
        if (i == this.offsetToOriginalCheckSum)
            continue;

        // take DWORD into account for computation
        final long currentWord = (paddedData[i] & 0xff)
                               + ((paddedData[i + 1] & 0xff) << 8)
                               + ((paddedData[i + 2] & 0xff) << 16)
                               + ((paddedData[i + 3] & 0xff) << 24);

        checksum = (checksum & 0xffffffffL) + (currentWord & 0xffffffffL);

        if (checksum >= max) // >= so that a sum of exactly 2^32 keeps its carry
            checksum = (checksum & 0xffffffffL) + (checksum >> 32);
    }

    checksum = (checksum & 0xffff) + (checksum >> 16);
    checksum = checksum + (checksum >> 16);
    checksum = checksum & 0xffff;
    checksum += data.length; // must be original data length

In this case, Java is a bit inconvenient.

生生漫 2024-11-23 15:26:37

I was trying to solve the same issue in Java. Here is Mark's solution translated into Java, using a RandomAccessFile instead of a byte array as input:

static long computeChecksum(RandomAccessFile data, long checksumOffset) throws IOException {
    long checksum = 0;
    long top = (long) Math.pow(2, 32);
    long length = data.length();

    for (long i = 0; i < length / 4; i++) {
        if (i == checksumOffset / 4) {
            data.skipBytes(4);
            continue;
        }

        long ch1 = data.read();
        long ch2 = data.read();
        long ch3 = data.read();
        long ch4 = data.read();

        long dword = ch1 + (ch2 << 8) + (ch3 << 16) + (ch4 << 24);

        checksum = (checksum & 0xffffffffL) + dword + (checksum >> 32);

        if (checksum > top) {
            checksum = (checksum & 0xffffffffL) + (checksum >> 32);
        }
    }

    checksum = (checksum & 0xffff) + (checksum >> 16);
    checksum = checksum + (checksum >> 16);
    checksum = checksum & 0xffff;
    checksum += length;

    return checksum;
}
っ左 2024-11-23 15:26:37
private unsafe static int GetSetPEChecksum(byte[] Array) {
    var Value = 0;
    var Count = Array.Length;
    if(Count >= 64)
        fixed (byte* array = Array) {
            var Index = 0;
            var Coff = *(int*)(array + 60);
            if(Coff >= 64 && Count >= Coff + 92) {
                *(int*)(array + Coff + 88) = 0;
                var Bound = Count >> 1;
                if((Count & 1) != 0) Value = array[Count & ~1];
                var Short = (ushort*)array;
                while(Index < Bound) {
                    Value += Short[Index++];
                    Value = (Value & 0xffff) + (Value >> 16);
                    Value = (Value + (Value >> 16)) & 0xffff;
                }
                *(int*)(array + Coff + 88) = Value += Count;
            }
        }
    return Value;
}

If you need a short unsafe version... (it needs neither double nor long integers, and the array does not have to be aligned inside the algorithm)

终止放荡 2024-11-23 15:26:37

I would like to clarify the situation by demystifying the whole PE checksum story.
The checksum algorithm is simply this:

  • Load the PE image file
  • Determine the absolute offset of the CheckSum structure member (an uint32_t, differs for PE/32bit and PE/64bit)
  • Set sum := 0
  • Add all words of the image EXCEPT FOR the uint32_t containing the CheckSum itself to 'sum' WITH CARRY
  • Finally, add the 32bit-wide PE image size to sum, output CheckSum

There is no 'folding' whatsoever. This is simple finite addition taking any overflows into account. In terms of x86 CPU instructions, this is the 'adc' (add with carry) mnemonic instead of the 'add' mnemonic.

In plain C, without any compiler intrinsics, this is:

#include <stdint.h>
#include <byteswap.h>

#ifdef _BIG_ENDIAN_MACHINE
#define csum_bswap_16(_x) bswap_16(_x)
#else // Little Endian
#define csum_bswap_16(_x) (_x)
#endif

// IMPORTANT NOTE: DO provide one extra zero byte after the image if its
// --------------- size is ODD!!!
uint32_t compute_pe_checksum_16bit ( const uint8_t *image_data, uint32_t image_size, uint32_t csum_ofs )
{
  uint32_t          i;
  register uint16_t csum = 0, carry_check;

  for (i = 0; i < csum_ofs; i += sizeof(uint16_t) )
  {
    carry_check = csum;
    csum += csum_bswap_16( *((uint16_t*)(image_data + i)) );
    csum += (csum < carry_check) ? 1 : 0;
  }

  for (i = csum_ofs + sizeof(uint32_t); i < image_size; i += sizeof(uint16_t))
  {
    carry_check = csum;
    csum += csum_bswap_16( *((uint16_t*)(image_data + i)) );
    csum += (csum < carry_check) ? 1 : 0;
  }

  return ((uint32_t)csum) + image_size;
}

I have split the entire loop into two parts (before and after the CheckSum structure element, which has to be excluded) to make the CPU's branch prediction unit happy (otherwise, it would have to check for the CheckSum offset in every loop iteration).

Detecting an overflow (i.e. the CPU carry bit is 1) just means: 'If you compute x+y and the result is less than the previous sum (here: carry_check), then the carry bit is 1'.
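
The same carry test can be tried out in Python, where integers do not wrap on their own, so the 16-bit wrap-around has to be emulated with a mask. The name add16_with_carry is made up for this sketch.

```python
def add16_with_carry(csum, word):
    """Add two 16-bit values; if the result wrapped around, add the carry back in."""
    previous = csum
    csum = (csum + word) & 0xFFFF   # emulate uint16_t wrap-around
    if csum < previous:             # result got smaller, so the carry bit was 1
        csum = (csum + 1) & 0xFFFF
    return csum


print(hex(add16_with_carry(0xFFFF, 0x0002)))  # wraps, carry is added back
print(hex(add16_with_carry(0x1000, 0x0234)))  # no wrap, plain addition
```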

The proof that this implementation is correct is simple: take 'user32.dll' from your Windows installation and compute the checksum (that is what I did for testing).

Some additional remarks:

  • on a Big Endian machine, you have to byte-swap the to-be-added uint16_t values;
  • if you modify this function to use uint32_t's instead of uint16_t's (as mentioned by some answers/comments above), then this yields completely different checksums;
  • this very old (MS-DOS-style) algorithm is still used today: UEFI firmware loads EFI binaries, which are PE/COFF executables. There might be UEFI implementations that verify the checksum;
  • if you are looking for an optimized implementation and your system is equipped with a SIMD (Single Instruction Multiple Data) unit (almost all CPUs have this today), then you could add eight 16bit values totally in parallel (assuming 128bit SIMD registers) with carry, sometimes called 'horizontal add'.