How to write self-modifying code in x86 assembly

Posted on 2024-10-14 08:55:03

I'm looking at writing a JIT compiler for a hobby virtual machine I've been working on recently. I know a bit of assembly (I'm mainly a C programmer; I can read most assembly with a reference for opcodes I don't understand, and write some simple programs), but I'm having a hard time understanding the few examples of self-modifying code I've found online.

This is one such example: http://asm.sourceforge.net/articles/smc.html

The example program provided does about four different modifications when run, none of which are clearly explained. Linux kernel interrupts are used several times, and aren't explained or detailed. (The author moved data into several registers before calling the interrupts. I assume he was passing arguments, but these arguments aren't explained at all, leaving the reader to guess.)

What I'm looking for is the simplest, most straightforward example in code of a self-modifying program. Something that I can look at, and use to understand how self-modifying code in x86 assembly has to be written, and how it works. Are there any resources you can point me to, or any examples you can give that would adequately demonstrate this?

I'm using NASM as my assembler.

EDIT: I'm also running this code on Linux.

Comments (9)

软糖 2024-10-21 08:55:03

Wow, this turned out to be a lot more painful than I expected. 100% of the pain was Linux protecting the program from being overwritten and/or from executing data.

Two solutions are shown below. A lot of googling was involved: the fairly simple part, putting some instruction bytes in memory and executing them, was mine; the mprotect call and the page-size alignment were culled from Google searches, things I had to learn for this example.

The self-modifying code itself is straightforward. If you take the program, or at least just the two simple functions, compile it and then disassemble it, you get the opcodes for those instructions. Alternatively, use nasm to assemble blocks of assembly, etc. From this I determined the opcodes to load an immediate into eax and then return.

Ideally you simply put those bytes in some RAM and execute that RAM. To get Linux to do that you have to change the protection, which means you have to pass mprotect a pointer that is aligned on a page boundary. So allocate more than you need, find the address within that allocation that is on a page boundary, call mprotect on that address, put your opcodes in that memory, and then execute it.

The second example takes an existing function compiled into the program. Again, because of the protection mechanism, you cannot simply point at it and change bytes; you have to unprotect it from writes. So you back up to the prior page boundary and call mprotect with that address and enough bytes to cover the code to be modified. Then you can change the bytes/opcodes of that function any way you want (so long as you don't spill over into any function you want to keep using) and execute it. In this case you can see that fun() works, then I change it to simply return a value, call it again, and now it has been modified.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>   // getpagesize()

unsigned char * testfun;

unsigned int fun(unsigned int a) {
    return (a + 13);
}

unsigned int fun2(void) {
    return (13);
}

int main(void) {
    unsigned int ra;
    unsigned int pagesize;
    unsigned char * ptr;
    unsigned int offset;

    pagesize = getpagesize();
    testfun = malloc(1023 + pagesize + 1);
    if (testfun == NULL) return (1);
    //need to align the address on a page boundary
    printf("%p\n", testfun);
    testfun = (unsigned char * )(((long) testfun + pagesize - 1) & ~(pagesize - 1));
    printf("%p\n", testfun);

    if (mprotect(testfun, 1024, PROT_READ | PROT_EXEC | PROT_WRITE)) {
        printf("mprotect failed\n");
        return (1);
    }

    //400687: b8 0d 00 00 00          mov    $0xd,%eax
    //40068d: c3                      retq

    testfun[0] = 0xb8;
    testfun[1] = 0x0d;
    testfun[2] = 0x00;
    testfun[3] = 0x00;
    testfun[4] = 0x00;
    testfun[5] = 0xc3;

    ra = ((unsigned int( * )()) testfun)();
    printf("0x%02X\n", ra);

    testfun[0] = 0xb8;
    testfun[1] = 0x20;
    testfun[2] = 0x00;
    testfun[3] = 0x00;
    testfun[4] = 0x00;
    testfun[5] = 0xc3;

    ra = ((unsigned int( * )()) testfun)();
    printf("0x%02X\n", ra);

    printf("%p\n", fun);
    offset = (unsigned int)(((long) fun) & (pagesize - 1));
    ptr = (unsigned char * )((long) fun & (~(pagesize - 1)));

    printf("%p 0x%X\n", ptr, offset);

    if (mprotect(ptr, pagesize, PROT_READ | PROT_EXEC | PROT_WRITE)) {
        printf("mprotect failed\n");
        return (1);
    }

    //for(ra=0;ra<20;ra++) printf("0x%02X,",ptr[offset+ra]); printf("\n");

    ra = 4;
    ra = fun(ra);
    printf("0x%02X\n", ra);

    ptr[offset + 0] = 0xb8;
    ptr[offset + 1] = 0x22;
    ptr[offset + 2] = 0x00;
    ptr[offset + 3] = 0x00;
    ptr[offset + 4] = 0x00;
    ptr[offset + 5] = 0xc3;

    ra = 4;
    ra = fun(ra);
    printf("0x%02X\n", ra);

    return (0);
}
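
The page-boundary arithmetic used in the program above can be pulled out into two small helpers. This is only a sketch: `page_floor` and `protect_span` are hypothetical names that do not appear in the original answer, and the page size is assumed to be a power of two (as returned by getpagesize() on Linux). Unlike the original, which always protects exactly one page, `protect_span` also covers the case where the bytes to patch straddle a page boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Round an address down to the start of the page that contains it.
   page_size must be a power of two. */
uintptr_t page_floor(uintptr_t addr, uintptr_t page_size) {
    return addr & ~(page_size - 1);
}

/* Number of bytes, counted from page_floor(addr), that an mprotect() call
   must cover so that every byte in [addr, addr + len) is included.
   The result is a whole number of pages. */
size_t protect_span(uintptr_t addr, size_t len, uintptr_t page_size) {
    uintptr_t start = page_floor(addr, page_size);
    uintptr_t end = (addr + len + page_size - 1) & ~(page_size - 1);
    return (size_t)(end - start);
}
```

With these, the second mprotect call would take `(void *)page_floor((uintptr_t)fun, pagesize)` as its address and `protect_span((uintptr_t)fun, 6, pagesize)` as its length.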

浅笑依然 2024-10-21 08:55:03

Since you're writing a JIT compiler, you probably don't want self-modifying code; you want to generate executable code at runtime. These are two different things. Self-modifying code is code that is modified after it has already started running. Self-modifying code has a large performance penalty on modern processors, and therefore would be undesirable for a JIT compiler.

Generating executable code at runtime should be a simple matter of mmap()ing some memory with PROT_EXEC and PROT_WRITE permissions. You could also call mprotect() on some memory you allocated yourself, as dwelch did above.
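
A minimal sketch of that idea in C, for x86 on Linux only. The function names `emit_return_const` and `run_return_const` are made up for illustration. Note one deliberate difference from the description above: instead of mapping the memory writable and executable at once, this sketch maps it read/write, fills it in, and then switches it to read/execute with mprotect(), which also works on systems that refuse pages that are writable and executable at the same time.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Write "mov eax, imm32; ret" into buf (needs 6 bytes).
   Returns the number of bytes written. */
size_t emit_return_const(uint8_t *buf, uint32_t value) {
    buf[0] = 0xb8;                      /* mov eax, imm32 */
    buf[1] = (uint8_t)(value);          /* immediate, least significant byte first */
    buf[2] = (uint8_t)(value >> 8);
    buf[3] = (uint8_t)(value >> 16);
    buf[4] = (uint8_t)(value >> 24);
    buf[5] = 0xc3;                      /* ret */
    return 6;
}

/* Generate the function in a fresh anonymous mapping, call it once and
   return its result. Returns -1 if mmap() or mprotect() fails.
   Must only be called on an x86 or x86-64 CPU. */
long run_return_const(uint32_t value) {
    size_t len = 4096;
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    emit_return_const(mem, value);

    /* Drop write permission and add execute permission before calling. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }
    unsigned int (*fn)(void) = (unsigned int (*)(void))mem;
    long result = (long)fn();
    munmap(mem, len);
    return result;
}
```

Because mmap() always returns page-aligned memory, none of the manual alignment needed with malloc() is required here.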

情仇皆在手 2024-10-21 08:55:03

I'm working on a self-modifying game to teach x86 assembly, and had to solve this exact problem. I used the following three libraries:

AsmJit + AsmTk for assembling: https://github.com/asmjit/asmjit + https://github.com/asmjit/asmtk
UDIS86 for disassembling: https://github.com/vmt/udis86

Instructions are read with Udis86, the user can edit them as a string, and then AsmJit/AsmTk is used to assemble the new bytes. These can be written back to memory, and as other users have pointed out, the write-back requires using VirtualProtect on Windows or mprotect on Unix to fix the memory page permissions.

The code samples are just a little long for StackOverflow, so I'll refer you to an article I wrote with code samples:

https://medium.com/squallygame/how-we-wrote-a-self-hacking-game-in-c-d8b9f97bfa99

A functioning repo is here (very light-weight):

https://github.com/Squalr/SelfHackingApp

清醇 2024-10-21 08:55:03

This is written in AT&T assembly. As you can see from the execution of the program, output has changed because of self-modifying code.

Compilation: gcc -m32 modify.s modify.c

The -m32 option is used because the example works on 32-bit machines.

Assembly:

.globl f4
.data     

f4:
    pushl %ebp       #standard function start
    movl %esp,%ebp

f:
    movl $1,%eax # move 1 into %eax
    movl $0,f+1  # overwrite the immediate operand of the mov instruction above;
                 # the new immediate value is now 0. f+1 is the address of that
                 # operand, just past the one-byte opcode at f.

    popl %ebp    # standard end
    ret

C test-program:

 #include <stdio.h>

 // assembly function f4
 extern int f4();
 int main(void) {
 int i;
 for(i=0;i<6;++i) {
 printf("%d\n",f4());
 }
 return 0;
 }

Output:

1
0
0
0
0
0
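
The output flips from 1 to 0 after the first call because `movl $1,%eax` assembles to the opcode byte 0xb8 followed by a 4-byte little-endian immediate, so `f+1` is the address of the immediate itself. Here is a small C sketch of the same patch applied to a plain byte array. `patch_mov_eax_imm32` is a hypothetical helper, not part of the answer, and the bytes are only inspected, never executed.

```c
#include <stdint.h>

/* Overwrite the 32-bit immediate of a "mov eax, imm32" instruction.
   insn points at the opcode byte 0xb8; the immediate occupies
   insn[1] .. insn[4], least significant byte first.
   Returns 0 on success, -1 if the opcode byte is not 0xb8. */
int patch_mov_eax_imm32(uint8_t *insn, uint32_t value) {
    if (insn[0] != 0xb8) return -1;
    insn[1] = (uint8_t)(value);
    insn[2] = (uint8_t)(value >> 8);
    insn[3] = (uint8_t)(value >> 16);
    insn[4] = (uint8_t)(value >> 24);
    return 0;
}
```

Applied to the bytes `b8 01 00 00 00`, patching in 0 gives `b8 00 00 00 00`, which is what the second and later calls to f4 execute.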

爱格式化 2024-10-21 08:55:03

A slightly simpler example based on the one above. Thanks to dwelch, who helped a lot.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>

char buffer [0x2000];
void* bufferp;

char* hola_mundo = "Hola mundo!";
void (*_printf)(const char*,...);

void hola()
{ 
    _printf(hola_mundo);
}

int main ( void )
{
    //Compute the start of the page (the mask keeps the upper bits of 64-bit addresses)
    bufferp = (void*)( ((unsigned long)buffer+0x1000) & ~0xfffUL );
    if(mprotect(bufferp, 1024, PROT_READ|PROT_EXEC|PROT_WRITE))
    {
        printf("mprotect failed\n");
        return(1);
    }
    //The printf function has to be called by an exact address
    _printf = printf;

    //Copy the function hola into buffer
    memcpy(bufferp,(void*)hola,60); //Arbitrary size


    ((void (*)())bufferp)();  

    return(0);
}
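
A likely reason for the "exact address" comment in the code above: a direct call on x86 (opcode 0xe8) stores a displacement relative to the instruction that follows it, so a copied call instruction lands somewhere else, while a call through a function pointer holds an absolute address and survives the copy. The sketch below only does the arithmetic. `call_rel32` and `call_destination` are hypothetical names, and the explanation is an assumption about the author's intent rather than something stated in the answer.

```c
#include <stdint.h>

/* Displacement stored in a 5-byte "call rel32" (opcode 0xe8) located at
   call_site so that it reaches target. The CPU adds the displacement to
   the address of the next instruction, which is call_site + 5. */
int32_t call_rel32(uint64_t call_site, uint64_t target) {
    return (int32_t)(uint32_t)(target - (call_site + 5));
}

/* Address actually reached by a "call rel32" with displacement rel
   when the instruction sits at call_site. */
uint64_t call_destination(uint64_t call_site, int32_t rel) {
    return call_site + 5 + (uint64_t)(int64_t)rel;
}
```

For example, a call at 0x1000 that targets 0x2000 stores the displacement 0xffb. If those same bytes are copied to 0x5000, the call goes to 0x6000 instead.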

过度放纵 2024-10-21 08:55:03

You can also look at projects like GNU lightning. You give it code for a simplified RISC-type machine, and it generates the correct machine code dynamically.

A very real problem you should think about is interfacing with foreign libraries. You will probably need to support at least some system-level calls/operations for your VM to be useful. Kitsune's advice is a good start to get you thinking about system-level calls. You would probably use mprotect to ensure that the memory you have modified becomes legally executable. (@KitsuneYMG)

Some FFI allowing calls to dynamic libraries written in C should be sufficient to hide a lot of the OS specific details. All these issues can impact your design quite a bit, so it is best to start thinking about them early.

孤者何惧 2024-10-21 08:55:03

This question is tagged with 'assembly' and 'x86' but not with 'C'. While the person who asked the question mentions they work mostly with C, this question is likely to be encountered by people looking for a pure assembly solution (including me in the past). Hence, this is my attempt at the simplest possible demonstration of a JIT program, heavily inspired by old_timer's answer but rewritten in pure assembly.

.bss
.align 4096 # page size on my machine. You can automate this process using
            # libc's getpagesize() to make it a bit more portable, but hey!,
            # this is a minimum viable product!
exec: 
    .skip 10000




.text
mprotectoutput: .asciz "mprotect output value %d\n"

.global main
main:
    # prologue
    pushq %rbp
    movq %rsp, %rbp

    # body
    movq $exec, %rdi
    movq $10000, %rsi
    movq $7, %rdx
    call mprotect

    # print output from the mprotect function. If other than 0, the code will
    # segfault on `jmp *%rax`.
    movq $mprotectoutput, %rdi
    movq %rax, %rsi
    xor %rax, %rax
    call printf

    # the subroutine will move 0x45 to %rax, then return to the address
    # in register %r15

    # set the return address
    movq $back, %r15

    # rdi will be a counter that counts how many program bytes were written
    xor %rdi, %rdi
    # 48 c7 c0 45 00 00 00  mov    $0x45,%rax
    movq $0x0000000045c0c748, %rax
    movq %rax, exec(%rdi)
    addq $7, %rdi
    # 41 ff e7              jmp    *%r15
    movl $0x00e7ff41, %eax
    movl %eax, exec(%rdi)
    addq $3, %rdi

    movq $exec, %rax
    jmp *%rax

back:
    # epilogue
    movq %rbp, %rsp
    popq %rbp
    ret
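
The two constants in the listing above, 0x0000000045c0c748 and 0x00e7ff41, look backwards compared with the instruction bytes in the comments because x86 stores the least significant byte first. A small C sketch that reproduces the two stores byte by byte; `store_le` is a hypothetical helper written with shifts so it gives the same result on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low `count` bytes of value at dst, least significant byte
   first, which is the order an x86 store instruction writes them.
   count must be at most 8. */
void store_le(uint8_t *dst, uint64_t value, size_t count) {
    for (size_t i = 0; i < count; i++)
        dst[i] = (uint8_t)(value >> (8 * i));
}
```

Storing 8 bytes of 0x0000000045c0c748 at offset 0 and then 4 bytes of 0x00e7ff41 at offset 7 leaves `48 c7 c0 45 00 00 00 41 ff e7 00` in the buffer: the second store overwrites the padding byte of the first, which is why the counter in the assembly advances by 7 rather than 8.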

两仪 2024-10-21 08:55:03

Thanks to the ELF extensions to the section directive (https://nasm.us/doc/nasmdoc8.html#section-8.9.2), a much simpler solution than the ones given can be written. These extensions let you define custom sections, in particular one that is both writable and executable. Based on that insight, I wrote this (tested on Linux amd64):

    ; Here is a trivial example of self-modifying code.
    ; The instruction at to_modify would print 'A', but 
    ; because of the instruction at label `modifier`, 
    ; the 'A' is replaced by a 'B'. While this isn't changing the op itself,
    ; it is however modifying a hard-coded argument (within a code section), 
    ; so I would say it counts.
    ;
    ; Many of the examples online segfaulted when I tried to run them, or
    ; just wouldn't compile. This example uses nasm's 
    ; section directives found on https://nasm.us/doc/nasmdoc8.html#section-8.9.2,
    ; which allows us to create a writable AND executable section, .textmodify.
    ; 
    ; Be warned; this code may mess up your computer, as it has not been tested on computers
    ; other than mine
    ;
    ; To compile:
    ; $ nasm -f elf self_modify.asm && ld -m elf_i386 -o self_modify self_modify.o && ./self_modify
    ; I am using nasm version 2.14.02.
    ; 
    ; The expected output should be
    ; Original code
    ; Modified code
    ; B
    ;
    ; Whereas if you comment out the line at modifier:
    ; Original code
    ; Modified code
    ; A
    ;
    ; Try improving this program by isolating the .textmodify section to code that will
    ; change (I haven't tried this yet). 
    ; 
    
    
    section .textmodify   progbits    alloc   exec    write   align=1
    global _start
    
    _start:
        ; Print "Original code"
        mov eax, 4
        mov ebx, 1
        mov ecx, msg1
        mov edx, len1
        int 0x80
    
        ; Modify the code
    modifier:
        mov dword [to_modify+1], 0x42  ; 
    
        ; Print "Modified code"
        mov eax, 4
        mov ebx, 1
        mov ecx, msg2
        mov edx, len2
        int 0x80
    
    modified_code:
        ; This instruction will be modified
        to_modify:
        ; This instruction assembles to the bytes
        ; b8 41 00 00 00
        ; (hex). The first byte is the opcode for mov, the second is
        ; the character code for 'A'. Thus we replace [to_modify+1] with 0x42.
        mov eax, 'A'
        nop
        
        ; Print the modified character
        push eax
        mov eax, 4
        mov ebx, 1
        mov ecx, esp
        mov edx, 1
        int 0x80
        pop eax
    
        ; Exit
        mov eax, 1
        xor ebx, ebx
        int 0x80
    
    section .data
        msg1 db "Original code", 10
        len1 equ $ - msg1
        msg2 db "Modified code", 10
        len2 equ $ - msg2

Remember all of the normal caveats about self modifying code apply (it's dangerous, insecure, could burn down your house...)

EDIT: The previous version of this answer said that we needed 1 byte alignment in order to access any part of an instruction. This was incorrect; the code seems to work with an align value of both 1 and 16.

杯别 2024-10-21 08:55:03

I've never written self-modifying code, although I have a basic understanding of how it works. Basically, you write the instructions you want to execute into memory and then jump there. The processor interprets the bytes you've written as instructions and (tries to) execute them. For example, viruses and anti-copy programs may use this technique.
Regarding the system calls, you were right: arguments are passed via registers. For a reference of Linux system calls and their arguments, just check here.
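
To make the register convention concrete: in the 32-bit `int 0x80` interface used by the NASM example earlier in this thread, eax holds the system call number (4 is write, 1 is exit) and ebx, ecx and edx hold the first three arguments. The C sketch below makes the same write call through libc's syscall() wrapper, so the arguments appear in the same order as the registers. `write_stdout` is a made-up name for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes syscall() on glibc */
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the Linux write system call directly.
   Assembly equivalent on 32-bit x86:
     eax = SYS_write, ebx = 1 (stdout), ecx = text, edx = len, int 0x80
   Returns the number of bytes written, or -1 on error. */
long write_stdout(const char *text, size_t len) {
    return syscall(SYS_write, 1, text, len);
}
```

The value of SYS_write differs between architectures (it is 4 only on 32-bit x86), which is why the symbolic name from <sys/syscall.h> is used here instead of a literal number.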
