c metaprogramming self-modifying

Can a C program modify its own executable file?

Posted on 2024-09-26 23:33:36

I had a little too much time on my hands and started wondering if I could write a self-modifying program. To that end, I wrote a "Hello World" in C, then used a hex editor to find the location of the "Hello World" string in the compiled executable. Is it possible to modify this program to open itself and overwrite the "Hello World" string?

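#include <stdio.h>

char* str = "Hello World\n";

int main(int argc, char* argv[]) {

  printf("%s", str);

  /* Reopen our own executable (argv[0]) for in-place binary update. */
  FILE * file = fopen(argv[0], "r+b");
  if (file == NULL) {
    perror("fopen");
    return 1;
  }

  /* 0x1000 is the file offset of the string, as located with the hex editor. */
  fseek(file, 0x1000, SEEK_SET);
  fputs("Goodbyewrld\n", file);
  fclose(file);

  return 0;
}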

This doesn't work. I'm assuming there's something preventing it from opening itself, since I can split this into two separate programs (a "Hello World" and something to modify it) and it works fine.

EDIT: My understanding is that when the program is run, it's loaded completely into RAM. So the executable on the hard drive is, for all intents and purposes, a copy. Why would it be a problem for it to modify itself?

Is there a workaround?

Thanks

恍梦境° 2024-10-03 23:33:36

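On Windows, when a program is run, the entire *.exe file is mapped into memory using the memory-mapped-file functions in Windows. This means the file isn't necessarily all loaded at once; instead, its pages are loaded on demand as they are accessed.

When a file is mapped this way, no application (including the program itself) can write to that file to change it while it is running. (Likewise, a running executable cannot be renamed on Windows, although it can on Linux and other Unix systems with inode-based filesystems.)

It is possible to change the bits mapped into memory, but if you do, the OS applies "copy-on-write" semantics: the underlying file on disk is not changed, and instead a copy of the affected page(s) is made in memory with your modifications. Before you are allowed to do this, though, you usually have to adjust the protection bits on the memory in question (e.g. with VirtualProtect).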
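
The copy-on-write behaviour can be seen in miniature with an ordinary data file. The sketch below is a POSIX analog (mmap with MAP_PRIVATE), not the Windows loader mechanism described above; the helper name cow_demo and the scratch path passed to it are made up for illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: writes "Hello World\n" to a scratch file, maps it
 * privately (copy-on-write), patches the mapping, then re-reads the file.
 * Returns 1 if the mapping changed while the file on disk did not,
 * 0 if that was not the case, and -1 on error. */
int cow_demo(const char *path)
{
    static const char before[] = "Hello World\n";
    static const char after[]  = "Goodbyewrld\n";
    const size_t len = sizeof before - 1;
    char on_disk[sizeof before];
    int mapping_changed, file_unchanged;
    char *map;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, before, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* MAP_PRIVATE: stores go to a private copy of the page, not the file. */
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, after, len);
    mapping_changed = memcmp(map, after, len) == 0;

    /* Read the file back through the descriptor to see what is on disk. */
    file_unchanged = lseek(fd, 0, SEEK_SET) == 0 &&
                     read(fd, on_disk, len) == (ssize_t)len &&
                     memcmp(on_disk, before, len) == 0;

    munmap(map, len);
    close(fd);
    unlink(path);
    return (mapping_changed && file_unchanged) ? 1 : 0;
}
```

Calling cow_demo with a writable scratch path is expected to return 1 on a POSIX system: the in-memory view is patched while the file keeps its original bytes.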

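At one time, it was common for low-level assembly programs running in very memory-constrained environments to use self-modifying code. Hardly anyone does this anymore, because we no longer work under the same constraints, and modern processors have long pipelines that get very upset if you start changing code out from underneath them.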

山有枢 2024-10-03 23:33:36

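If you are using Windows, you can do the following:

Step-by-step example:

Call VirtualProtect() on the code pages you want to modify, with the PAGE_WRITECOPY protection.
Modify the code pages.
Call VirtualProtect() on the modified code pages, with the PAGE_EXECUTE protection.
Call FlushInstructionCache().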

For more information, see "How to Modify Executable Code in Memory" (archived August 2010).

究竟谁懂我的在乎 2024-10-03 23:33:36

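It is very operating-system dependent. Some operating systems lock the file, so you could try to cheat by making a new copy of it somewhere, but then you are just running another version of the program.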
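
A minimal sketch of that "patch a copy" trick, assuming the offset of the string is already known (for example from a hex editor). The helper name copy_and_patch is made up for illustration, and the caller is assumed to pass an offset and length that lie inside the file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src to dst, then overwrites len bytes at
 * the given offset in dst. Returns 0 on success, -1 on failure. */
int copy_and_patch(const char *src, const char *dst,
                   long offset, const void *bytes, size_t len)
{
    char buf[4096];
    size_t n;
    int ok = 1;
    FILE *in, *out;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    out = fopen(dst, "wb+");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Plain byte-for-byte copy of the original file. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;

    /* Patch the copy; the original file is only ever read. */
    if (ok && (fseek(out, offset, SEEK_SET) != 0 ||
               fwrite(bytes, 1, len, out) != len))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A program could call this with its own path as src, for instance copy_and_patch(argv[0], "hello_patched", 0x1000, "Goodbyewrld\n", 12), where the output name is again only an example. Reading a running executable is normally permitted even where writing to it is not.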

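Other operating systems perform security checks on the file, the iPhone for example, so writing to it would be a huge amount of work, and on top of that it resides as a read-only file.

On still other systems, you might not even know where the file is.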

对风讲故事 2024-10-03 23:33:36

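All the present answers more or less revolve around the fact that today you can no longer easily write self-modifying machine code. I agree that this is basically true for today's PCs.

However, if you really want to see your own self-modifying code in action, you have some options:

Try out microcontrollers; the simpler ones do not have advanced pipelining. The cheapest and quickest choice I found is an MSP430 USB stick.
If emulation is acceptable to you, you can run an emulator for an older, non-pipelined platform.
If you want self-modifying code just for the fun of it, you can try Corewars.
If you are willing to move from C to a Lisp dialect, code that writes code is very natural there. I would suggest Scheme, which is intentionally kept small.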

百合的盛世恋 2024-10-03 23:33:36

If we're talking about doing this in an x86 environment, it shouldn't be impossible. It should be used with caution, though, because x86 instructions are variable-length. A longer instruction may overwrite the following instruction(s), and a shorter one will leave residual bytes from the overwritten instruction, which should be filled with NOP instructions.
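
A small sketch of that padding rule, operating on a plain byte buffer rather than live code. The helper name patch_instruction is made up for illustration; 0x90 is the one-byte x86 NOP opcode.

```c
#include <stddef.h>
#include <string.h>

#define X86_NOP 0x90  /* one-byte x86 NOP */

/* Hypothetical helper: replaces the old_len-byte instruction at slot with
 * new_insn. A shorter replacement is padded with NOPs so no stale bytes
 * remain; a longer one is refused because it would overwrite the
 * instruction that follows. Returns 0 on success, -1 if it does not fit. */
int patch_instruction(unsigned char *slot, size_t old_len,
                      const unsigned char *new_insn, size_t new_len)
{
    if (new_len > old_len)
        return -1;
    memcpy(slot, new_insn, new_len);
    memset(slot + new_len, X86_NOP, old_len - new_len);
    return 0;
}
```

For example, replacing the 5-byte `mov eax, 42` (B8 2A 00 00 00) with the 2-byte `xor eax, eax` (31 C0) leaves 31 C0 90 90 90, and the instruction after it stays intact.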

When the x86 first gained protected mode, the Intel reference manuals recommended the following method for debugging access to XO (execute-only) areas:

create a new, empty selector ("high" part of far pointers)
set its attributes to that of the XO area
the new selector's access properties must be set RO DATA if you only want to look at what's in it
if you want to modify the data the access properties must be set to RW DATA

So the answer to the problem is in the last step. The RW is necessary if you want to be able to insert the breakpoint instruction which is what debuggers do. More modern processors than the 80286 have internal debug registers to enable non-intrusive monitoring functionality which could result in a breakpoint being issued.
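
The breakpoint insertion mentioned above can be sketched on a byte buffer. On x86 the software breakpoint is the one-byte INT3 opcode 0xCC; the type and function names below are hypothetical, and a real debugger would write into the debuggee's memory rather than a local array.

```c
#include <stddef.h>

#define X86_INT3 0xCC  /* one-byte x86 software breakpoint opcode */

/* Remembers where a breakpoint was planted and which byte it replaced. */
typedef struct {
    size_t offset;
    unsigned char saved;
} breakpoint;

/* Hypothetical helper: saves the original byte at offset and overwrites
 * it with INT3. This is why the code area needs to be writable. */
breakpoint set_breakpoint(unsigned char *code, size_t offset)
{
    breakpoint bp;
    bp.offset = offset;
    bp.saved = code[offset];
    code[offset] = X86_INT3;
    return bp;
}

/* Hypothetical helper: restores the byte that set_breakpoint replaced. */
void clear_breakpoint(unsigned char *code, breakpoint bp)
{
    code[bp.offset] = bp.saved;
}
```

Only the first byte of the target instruction is replaced, so the saved byte is all that is needed to restore the original code.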

Windows made available the building blocks for doing this starting with Win16. They are probably still in place. I think Microsoft calls this class of pointer manipulation "thunking."

I once wrote a very fast 16-bit database engine in PL/M-86 for DOS. When Windows 3.1 arrived (running on 80386s) I ported it to the Win16 environment. I wanted to make use of the 32-bit memory available but there was no PL/M-32 available (or Win32 for that matter).

To solve the problem, my program used thunking in the following way:

defined 32-bit far pointers (sel_16:offs_32) using structures
allocated 32-bit data areas (that is, areas larger than 64 KB) using global memory and received them in 16-bit far pointer (sel_16:offs_16) format
filled in the data in the structures by copying the selector, then calculating the offset using 16-bit multiplication with 32-bit results.
loaded the pointer/structure into es:ebx using the instruction size override prefix
accessed the data using a combination of the instruction size and operand size prefixes
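
The offset arithmetic in the steps above can be illustrated in modern C. This is only a sketch of the calculation, not the original PL/M-86 code: the struct layout, the names far32_ptr and element_address, and the row-major 2304 x 2304 matrix of 8-byte doubles are assumptions, and standard C cannot actually load a selector into es.

```c
#include <stdint.h>

/* Assumed layout of a sel_16:offs_32 far pointer kept in a structure. */
typedef struct {
    uint32_t offs;  /* 32-bit offset within the segment */
    uint16_t sel;   /* 16-bit selector copied from the 16-bit far pointer */
} far32_ptr;

enum {
    DIM = 2304,                  /* matrix dimension from the text */
    ELEM_SIZE = 8,               /* bytes per double-precision value */
    ROW_BYTES = DIM * ELEM_SIZE  /* 18432, still fits in 16 bits */
};

/* Hypothetical helper: builds the far pointer to element (row, col).
 * Both multiplications take 16-bit operands and produce a 32-bit result,
 * which is what lets the offset exceed 64 KB. */
far32_ptr element_address(uint16_t sel, uint16_t row, uint16_t col)
{
    far32_ptr p;
    p.sel = sel;
    p.offs = (uint32_t)row * (uint16_t)ROW_BYTES
           + (uint32_t)col * (uint16_t)ELEM_SIZE;
    return p;
}
```

The last element, (2303, 2303), lands at offset 42,467,320, so the whole area is 42,467,328 bytes, which matches the "around 40MB" figure.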

Once the mechanism was bug free it worked without a hitch. The largest memory areas my program used were 2304*2304 double precision which comes out to around 40MB. Even today, I would call this a "large" block of memory. In 1995 it was 30% of a typical SDRAM stick (128 MB PC100).

雨轻弹 2024-10-03 23:33:36

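There are non-portable ways to do this on many platforms. On Windows, for example, you can do it with WriteProcessMemory(). However, in 2010 it is usually a very bad idea. These aren't the days of DOS, when you coded in assembly and did this to save space. It is very hard to get right, and you are basically asking for stability and security problems. Unless you are doing something very low-level, like a debugger, I would say don't bother: the problems you will introduce are not worth whatever gain you might get.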