How to write self-modifying code in C?

Posted 2024-12-04 17:26:22

I want to write a piece of code that changes itself continuously, even if the change is insignificant.

For example, maybe something like:

for i in 1 to 100 do
begin
    x := 200
    for j in 200 downto 1 do
    begin
        do something
    end
end

Suppose I want my code, after the first iteration, to change the line x := 200 to x := 199, then after the next iteration to x := 198, and so on.

Is it possible to write such code? Would I need to use inline assembly for that?

EDIT:
Here is why I want to do it in C:

This program will run on an experimental operating system, and I can't (or don't know how to) use programs compiled from other languages. The real reason I need such code is that it will run on a guest operating system in a virtual machine. The hypervisor is a binary translator that translates chunks of code and applies some optimizations. It translates each chunk only once: the next time the same chunk is used in the guest, the translator reuses the previously translated result. If the code is modified on the fly, the translator notices, marks its previous translation as stale, and is forced to re-translate the same code. That is what I want to achieve: forcing the translator to do many translations. Typically these chunks are the instructions between two branch instructions (such as jump instructions). I just think that self-modifying code would be a fantastic way to achieve this.

不醒的梦 2024-12-11 17:26:22

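You might want to consider writing a virtual machine in C, in which you can build your own self-modifying code.

If you want to write a self-modifying executable, much depends on the operating system you are targeting. You can approach your goal by modifying the in-memory program image. To do so, you first obtain the in-memory address of your program's code bytes. Then you change the operating system's protection on that memory range so that you can modify the bytes without hitting an access violation or SIGSEGV. Finally, you use pointers (perhaps unsigned char * pointers, or possibly unsigned long * on RISC machines) to modify the opcodes of the compiled program.

A key point is that you will be modifying machine code for the target architecture. C code has no canonical format at run time: C is a specification of the textual input to a compiler.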
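
As a rough illustration of the virtual-machine idea, here is a minimal sketch of a toy bytecode interpreter whose program patches its own first instruction on every pass, so that x := 200 becomes x := 199, then x := 198. The opcode names, vm_run, and vm_demo are all made up for this example. Note that it rewrites bytecode held in an ordinary array, not machine code, so on its own it would probably not make a binary translator re-translate anything.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy VM; none of these names come from the thread. */
enum { OP_SET_X, OP_ADD_X, OP_PATCH, OP_DJNZ, OP_HALT };

/* Interprets `code` in place. `counter` is the loop counter used by OP_DJNZ.
   Returns the sum of every value x held when OP_ADD_X ran. */
long vm_run(int *code, int counter)
{
    long total = 0;
    int x = 0;
    size_t pc = 0;

    for (;;) {
        switch (code[pc]) {
        case OP_SET_X:              /* x := immediate operand */
            x = code[pc + 1];
            pc += 2;
            break;
        case OP_ADD_X:              /* total += x */
            total += x;
            pc += 1;
            break;
        case OP_PATCH:              /* code[addr] += delta: the self-modification */
            code[code[pc + 1]] += code[pc + 2];
            pc += 3;
            break;
        case OP_DJNZ:               /* decrement counter, jump while it is positive */
            if (--counter > 0)
                pc = (size_t)code[pc + 1];
            else
                pc += 2;
            break;
        case OP_HALT:
        default:
            return total;
        }
    }
}

/* Runs the loop from the question for `iterations` passes. */
long vm_demo(int iterations)
{
    int code[] = {
        OP_SET_X, 200,      /* cell 1 holds the immediate that gets patched */
        OP_ADD_X,
        OP_PATCH, 1, -1,    /* rewrite "x := 200" to "x := 199", and so on */
        OP_DJNZ, 0,
        OP_HALT
    };
    return vm_run(code, iterations);
}
```

With these definitions, vm_demo(3) adds 200 + 199 + 198 and returns 597.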

失眠症患者 2024-12-11 17:26:22

Sorry, I am answering a bit late, but I think I found exactly what you are looking for: https://shanetully.com/2013/12/writing-a-self-mutating-x86_64-c-program/

In this article, the author first changes the value of a constant by patching the machine code of a function in memory, and then executes shellcode by overwriting that function's code.

Below is the first listing:

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>

void foo(void);
int change_page_permissions_of_address(void *addr);

int main(void) {
    void *foo_addr = (void*)foo;

    // Change the permissions of the page that contains foo() to read, write, and execute
    // This assumes that foo() is fully contained by a single page
    if(change_page_permissions_of_address(foo_addr) == -1) {
        fprintf(stderr, "Error while changing page permissions of foo(): %s\n", strerror(errno));
        return 1;
    }

    // Call the unmodified foo()
    puts("Calling foo...");
    foo();

    // Change the immediate value in the addl instruction in foo() to 42.
    // The offset 18 depends on the machine code your compiler emits for foo();
    // check the disassembly (for example with objdump -d) before relying on it.
    unsigned char *instruction = (unsigned char*)foo_addr + 18;
    *instruction = 0x2A;

    // Call the modified foo()
    puts("Calling foo...");
    foo();

    return 0;
}

void foo(void) {
    int i=0;
    i++;
    printf("i: %d\n", i);
}

int change_page_permissions_of_address(void *addr) {
    // Round the address down to the start of its page.
    // Arithmetic on void* is a GNU extension, so go through uintptr_t instead.
    long page_size = sysconf(_SC_PAGESIZE);
    if(page_size <= 0) {
        return -1;
    }
    uintptr_t page_start = (uintptr_t)addr - ((uintptr_t)addr % (uintptr_t)page_size);

    if(mprotect((void*)page_start, (size_t)page_size, PROT_READ | PROT_WRITE | PROT_EXEC) == -1) {
        return -1;
    }

    return 0;
}
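
The listing above assumes that foo() fits inside a single page. If the bytes to be patched might straddle a page boundary, the range passed to mprotect has to cover every page they touch. The following is a minimal sketch of that arithmetic; struct page_range and covering_pages are hypothetical names introduced here, not part of the article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a page-aligned start address and a length
   that is a whole number of pages. */
struct page_range {
    uintptr_t start;
    size_t length;
};

/* Returns the smallest page-aligned range that covers [addr, addr + len).
   page_size must be nonzero. */
struct page_range covering_pages(uintptr_t addr, size_t len, size_t page_size)
{
    struct page_range r;
    uintptr_t end = addr + len;

    /* Round the start down to a page boundary. */
    r.start = addr - (addr % page_size);

    /* Round the end up to a page boundary (unchanged if already aligned). */
    uintptr_t aligned_end = end + (page_size - end % page_size) % page_size;

    r.length = (size_t)(aligned_end - r.start);
    return r;
}
```

The result could then be handed to mprotect as (void *)r.start and r.length. For example, a 0x20-byte span starting at 0x1ff0 with 0x1000-byte pages yields start 0x1000 and length 0x2000, that is, two pages.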

祁梦 2024-12-11 17:26:22

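It is possible, but most likely not portably, and you may have to contend with read-only memory segments for the running code and other obstacles put in place by your OS.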

柠檬色的秋千 2024-12-11 17:26:22

This would be a good start. It is essentially Lisp functionality in C:

http://nakkaya.com/2010/08/24/a-micro-manual-for-lisp-implemented-in-c/

不奢求什么 2024-12-11 17:26:22

Depending on how much freedom you need, you may be able to accomplish what you want by using function pointers. Using your pseudocode as a jumping-off point, consider the case where we want to modify that variable x in different ways as the loop index i changes. We could do something like this:

#include <stdio.h>

void multiply_x (int * x, int multiplier)
{
    *x *= multiplier;
}

void add_to_x (int * x, int increment)
{
    *x += increment;
}

int main (void)
{
    int x = 0;
    int i;

    void (*fp)(int *, int);

    for (i = 1; i < 6; ++i) {
        fp = (i % 2) ? add_to_x : multiply_x;

        fp(&x, i);

        printf("%d\n", x);
    }

    return 0;
}

The output, when we compile and run the program, is:

1
2
5
20
25

Obviously, this will only work if you have a finite number of things you want to do with x on each pass through the loop. To make the changes persistent (which is part of what you want from "self-modification"), you would make the function-pointer variable either global or static. I'm not sure I can really recommend this approach, because there are often simpler and clearer ways of accomplishing this sort of thing.
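
To make the point about persistence concrete, here is a minimal sketch in which the function pointer lives at file scope and each step function re-points it, so the next call behaves differently. The names add_one, double_it, current_step, and apply_step are invented for this illustration.

```c
static int add_one(int x);
static int double_it(int x);

/* File-scope pointer: its value survives between calls. */
static int (*current_step)(int) = add_one;

static int add_one(int x)
{
    current_step = double_it;   /* the next call will double instead */
    return x + 1;
}

static int double_it(int x)
{
    current_step = add_one;     /* switch back for the call after that */
    return x * 2;
}

/* Applies whichever step is currently installed. */
int apply_step(int x)
{
    return current_step(x);
}
```

Calling apply_step(1), then apply_step(2), then apply_step(4) gives 2, 4, and 5: the same call site alternates between the two behaviors without any machine code being rewritten.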

倾城花音 2024-12-11 17:26:22

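An interpreted language (one that is not compiled and linked ahead of time like C) might be a better fit for this. Perl, JavaScript, and PHP all have the evil eval() function, which might suit your purpose. With it, you could keep a string of code that you constantly modify and then execute via eval().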

自由如风 2024-12-11 17:26:22

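The suggestion to implement LISP in C and then use that is solid, because of portability concerns. But if you really wanted to, on many systems this could also be done in the other direction, by loading your program's code bytes into memory and then returning to them.

There are a couple of ways you could attempt that. One is via a buffer overflow exploit. Another is to use mprotect() to make the code section writable and then modify the functions the compiler created.

Techniques like this are fun for programming challenges and obfuscated-code contests, but given how unreadable your code would be, and the fact that you would be exploiting what C considers undefined behavior, they are best avoided in production environments.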

还如梦归 2024-12-11 17:26:22

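In standard C11 (read n1570), you cannot write self-modifying code (at least not without undefined behavior). Conceptually at least, the code segment is read-only.

You might consider extending the code of your program with plugins, using your dynamic linker (https://en.wikipedia.org/wiki/Dynamic_linker). This requires operating-system-specific features. On POSIX, use dlopen (and probably dlsym to get pointers to the newly loaded functions). You can then overwrite function pointers with the addresses of the new functions.

Perhaps you could use a JIT-compiling library (such as libgccjit or asmjit) to achieve your goal. You would get fresh function addresses and put them in your function pointers.

Remember that a C compiler can generate code of various sizes for a given function call or jump, so even overwriting it in a machine-specific way is brittle.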
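
The dlopen/dlsym route can be sketched roughly as follows. To stay self-contained, this example passes NULL to dlopen, which yields a handle for the running program and its already-loaded libraries, and looks up the C library's strlen; a real plugin setup would instead pass the path of a shared object you built yourself. The names lookup_symbol, fallback_len, current_len, and install_from_linker are hypothetical.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef size_t (*strlen_fn)(const char *);

/* Looks up `name` among the symbols visible to the running program.
   Returns NULL if the handle cannot be opened or the symbol is missing. */
void *lookup_symbol(const char *name)
{
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return NULL;
    void *sym = dlsym(handle, name);
    dlclose(handle);
    return sym;
}

/* Default implementation used until a replacement is installed. */
static size_t fallback_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Callers always go through this pointer, so swapping it changes behavior. */
strlen_fn current_len = fallback_len;

/* Replaces current_len with the dynamically resolved strlen.
   Returns 0 on success, -1 if the symbol could not be found. */
int install_from_linker(void)
{
    void *sym = lookup_symbol("strlen");
    if (sym == NULL)
        return -1;
    /* POSIX expects this object-to-function pointer conversion to work,
       although ISO C does not define it. */
    current_len = (strlen_fn)sym;
    return 0;
}
```

On older glibc versions this may need to be linked with -ldl. Code that calls current_len("hello") gets 5 both before and after install_from_linker(); only the implementation behind the pointer changes.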

伪心 2024-12-11 17:26:22

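My friend and I ran into this problem while working on a game that modifies its own code. We let the user rewrite snippets of code in x86 assembly.

This only requires two libraries, an assembler and a disassembler:

FASM assembler: https://github.com/ZenLulz/Fasm.NET

Udis86 disassembler: https://github.com/vmt/udis86

We read instructions with the disassembler, let the user edit them, convert the new instructions to bytes with the assembler, and write them back to memory. The write-back requires VirtualProtect on Windows to change the page permissions so that the code can be edited. On Unix you have to use mprotect instead.

I posted an article about how we did it, along with sample code.

The examples are for Windows and use C++, but it should be easy to make them cross-platform and C only.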

情话已封尘 2024-12-11 17:26:22

This is how to do it on Windows with C++. You have to VirtualAlloc a byte array with read/write protection, copy your code there, and then VirtualProtect it with read/execute protection. Here is how to dynamically create a function that simply returns 15.

#include <cstdio>
#include <cstring>
#include <windows.h>

int main() {
    // Machine code for x86-64 (Intel syntax):
    //   xor rax, rax
    //   add rax, 15
    //   ret
    unsigned char bytes[] = { 0x48, 0x31, 0xC0, 0x48, 0x83, 0xC0, 0x0F, 0xC3 }; // put code here
    SIZE_T size = sizeof(bytes);

    // Allocate read/write memory and copy the code into it
    void* meth = VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (meth == NULL) {
        std::printf("Unable to VirtualAlloc memory for the code!\n");
        return 1;
    }
    std::memcpy(meth, bytes, size);

    // Switch the memory to read/execute before calling into it
    DWORD old_protect;
    if (VirtualProtect(meth, size, PAGE_EXECUTE_READ, &old_protect)) {
        FlushInstructionCache(GetCurrentProcess(), meth, size);
        typedef int (*fptr)();
        // Cast the pointer directly; going through long truncates on 64-bit Windows
        fptr my_fptr = reinterpret_cast<fptr>(meth);
        int number = my_fptr();
        for (int i = 0; i < number; i++) {
            std::printf("I will say this 15 times!\n");
        }
        VirtualFree(meth, 0, MEM_RELEASE);
        return 0;
    } else {
        std::printf("Unable to VirtualProtect code with execute protection!\n");
        VirtualFree(meth, 0, MEM_RELEASE);
        return 1;
    }
}

You assemble the code using this tool.

淡淡绿茶香 2024-12-11 17:26:22

While "true" self-modifying code in C is impossible (the assembly way feels like a slight cheat, because at that point we are writing self-modifying code in assembly rather than in C, which was the original question), there might be a pure C way to get a similar effect: statements that paradoxically do not do what you think they are supposed to do. I say paradoxically because both the assembly self-modifying code and the following C snippet might not make sense superficially or intuitively, yet they are logical once you put intuition aside and analyze them, and that discrepancy is what makes a paradox a paradox.

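#include <stdio.h>
#include <string.h>

int main(void)
{
    struct Foo
    {
        char a;
        char b[4];
    } foo;

    foo.a = 42;
    /* Copy all 4 bytes so that b is NUL-terminated; copying only 3 would
       leave b[3] uninitialized and make the %s below undefined behavior. */
    strncpy(foo.b, "foo", sizeof foo.b);
    printf("foo.a=%i, foo.b=\"%s\"\n", foo.a, foo.b);

    /* 1918984746 is 0x7261622A: on a little-endian machine its bytes are
       0x2A (42), 'b', 'a', 'r', which land on a, b[0], b[1], b[2].
       Writing an int through a char member like this is itself undefined
       behavior in standard C; it only happens to work on common compilers. */
    *(int*)&foo.a = 1918984746;
    printf("foo.a=%i, foo.b=\"%s\"\n", foo.a, foo.b);

    return 0;
}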

$ gcc -o foo foo.c && ./foo
foo.a=42, foo.b="foo"
foo.a=42, foo.b="bar"

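First, we set foo.a and foo.b and print the struct. Then we appear to change only foo.a, yet the output shows that foo.b changed as well: the int-sized store through &foo.a also overwrites the first three bytes of foo.b. The output shown assumes a little-endian machine with a 4-byte int.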
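
The same surprise can be reproduced without the aliasing problem by copying bytes with memcpy, which is allowed to overwrite an object's representation. This is only a sketch: overwrite_prefix and overwrite_demo are made-up names, and it assumes that b sits directly after a with no padding, which holds on mainstream ABIs but is not guaranteed by the C standard (the code checks it with offsetof).

```c
#include <stddef.h>
#include <string.h>

struct Foo {
    char a;
    char b[4];
};

/* Overwrites a and the first three bytes of b with a single copy.
   Returns 0 on success, -1 if the layout assumption does not hold. */
int overwrite_prefix(struct Foo *foo)
{
    static const unsigned char patch[4] = { 42, 'b', 'a', 'r' };

    /* Assumption: b starts immediately after a (no padding in between). */
    if (offsetof(struct Foo, b) != 1)
        return -1;

    /* Byte-wise copy, so the result does not depend on endianness. */
    memcpy(foo, patch, sizeof patch);
    return 0;
}

/* Returns 1 if b reads "bar" after the patch while a is still 42. */
int overwrite_demo(void)
{
    struct Foo foo;

    foo.a = 42;
    strncpy(foo.b, "foo", sizeof foo.b);   /* 'f', 'o', 'o', '\0' */

    if (overwrite_prefix(&foo) != 0)
        return 0;

    return foo.a == 42 && strcmp(foo.b, "bar") == 0;
}
```

Because the patch is spelled out byte by byte, b ends up as "bar" regardless of byte order, and b[3] keeps the NUL terminator written by strncpy.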
