Learning disassembly

Posted 2024-10-11 04:07:40

To understand what happens under the hood, I am writing small C programs, disassembling them, and trying to understand the objdump output.

The C program is:

#include <stdio.h>

int function(int a, int b, int c) {
    printf("%d, %d, %d\n", a,b,c);
}

int main() {
    int a;
    int *ptr;

    asm("nop");
    function(1,2,3);
}

The objdump output for function gives me the following.

080483a4 <function>:
 80483a4:   55                      push   ebp
 80483a5:   89 e5                   mov    ebp,esp
 80483a7:   83 ec 08                sub    esp,0x8
 80483aa:   ff 75 10                push   DWORD PTR [ebp+16]
 80483ad:   ff 75 0c                push   DWORD PTR [ebp+12]
 80483b0:   ff 75 08                push   DWORD PTR [ebp+8]
 80483b3:   68 04 85 04 08          push   0x8048504
 80483b8:   e8 fb fe ff ff          call   80482b8 <printf@plt>
 80483bd:   83 c4 10                add    esp,0x10
 80483c0:   c9                      leave

Notice that before the call to printf, three DWORDs at offsets 16, 12 and 8 from ebp (they must be the arguments to function, in reverse order) are pushed onto the stack. After that, a hex address is pushed, which must be the address of the format string.

My question is:

Rather than seeing the three DWORDs and the format string address pushed onto the stack directly, I expected to see esp decremented manually and the values then stored onto the stack. How can this behaviour be explained?

梦亿 2024-10-18 04:07:40

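Well, some machines have a stack pointer that is much like any other register, so the way you push something is, yes, a decrement followed by a store.

But some machines, such as x86 (32-bit and 64-bit), have a push instruction that performs a macro-op: it decrements the stack pointer and does the store.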
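As a rough illustration of that macro-op, here is a minimal sketch in C of a toy machine with a byte-addressed, downward-growing stack. The names `ToyMachine`, `push32` and `pop32` are hypothetical and exist only for this example; the sketch assumes 32-bit stack slots, as in the disassembly in the question, and does no bounds checking.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Hypothetical toy machine: a small block of memory plus a stack pointer
   that holds a byte offset into it. The stack grows toward lower addresses. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;
} ToyMachine;

void machine_init(ToyMachine *m) {
    memset(m->mem, 0, sizeof m->mem);
    m->esp = STACK_BYTES;            /* empty stack: esp is just past the end */
}

/* Read a 32-bit word at a byte offset. */
uint32_t load32(const ToyMachine *m, uint32_t addr) {
    uint32_t value;
    memcpy(&value, &m->mem[addr], sizeof value);
    return value;
}

/* Models "push value" as its two elementary steps. */
void push32(ToyMachine *m, uint32_t value) {
    m->esp -= 4;                                    /* sub esp, 4        */
    memcpy(&m->mem[m->esp], &value, sizeof value);  /* mov [esp], value  */
}

/* Models "pop": load the top word, then release its slot. */
uint32_t pop32(ToyMachine *m) {
    uint32_t value = load32(m, m->esp);
    m->esp += 4;
    return value;
}
```

On a machine without a push instruction, the two statements inside `push32` would be two separate instructions; on x86 the single `push` does both.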

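Macro-ops, by the way, have a funny history. At times, certain macro-ops on certain machines have been slower than performing the same elementary operations with simple instructions.

I doubt that is often the case today. Modern x86 is remarkably sophisticated. The CPU decodes your opcodes into micro-ops, which it then stores in a cache. The micro-ops have specific pipeline and time-slot requirements, and the end result is that there is effectively a RISC CPU inside an x86 these days; the whole thing runs very fast and has good code density at the architectural level.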

墨离汐 2024-10-18 04:07:40

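The stack pointer is adjusted by the push instructions. That is why it is first copied into ebp: the pushes change esp, so the arguments are addressed relative to ebp instead. Once the arguments have been pushed, each of them exists in two places: in function's stack frame and in the argument area built for printf.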

屌丝范 2024-10-18 04:07:40

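There is no mov [esp+x],[ebp+y] instruction: x86 mov cannot take two memory operands. The copy would take two instructions and use a register. push does it in a single instruction.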

笙痞 2024-10-18 04:07:40

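This is the standard cdecl calling convention for x86 machines. There are several different calling conventions. You can read about them in the following Wikipedia article:

http://en.wikipedia.org/wiki/X86_calling_conventions

It explains the basic principles.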
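To make the right-to-left push order concrete, the following C sketch simulates the stack around the disassembly in the question. It is only a model under stated assumptions: the names `ToyCpu`, `enter_function`, `arg_at` and `push_printf_args` are hypothetical, the stack is indexed in 4-byte words rather than bytes, the return address and saved ebp are placeholder values, and the caller side (main pushing 3, 2, 1) is assumed from cdecl rather than taken from the posted disassembly.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 32

/* Hypothetical toy CPU state: esp and ebp are word indices into mem. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t   esp;
    size_t   ebp;
} ToyCpu;

void push32(ToyCpu *cpu, uint32_t value) {
    cpu->mem[--cpu->esp] = value;    /* decrement, then store */
}

/* Assumed caller side plus function's prologue. */
void enter_function(ToyCpu *cpu, uint32_t a, uint32_t b, uint32_t c) {
    cpu->esp = STACK_WORDS;
    push32(cpu, c);                  /* cdecl: arguments pushed right to left */
    push32(cpu, b);
    push32(cpu, a);
    push32(cpu, 0xdeadbeefu);        /* placeholder return address (call)     */
    push32(cpu, 0u);                 /* placeholder saved ebp (push ebp)      */
    cpu->ebp = cpu->esp;             /* mov ebp, esp                          */
}

/* Reads the word at [ebp + byte_offset], e.g. 8, 12 or 16. */
uint32_t arg_at(const ToyCpu *cpu, size_t byte_offset) {
    return cpu->mem[cpu->ebp + byte_offset / 4];
}

/* Mirrors the body of function up to the call to printf. */
void push_printf_args(ToyCpu *cpu, uint32_t fmt_addr) {
    cpu->esp -= 2;                   /* sub esp, 0x8               */
    push32(cpu, arg_at(cpu, 16));    /* push DWORD PTR [ebp+16]    */
    push32(cpu, arg_at(cpu, 12));    /* push DWORD PTR [ebp+12]    */
    push32(cpu, arg_at(cpu, 8));     /* push DWORD PTR [ebp+8]     */
    push32(cpu, fmt_addr);           /* push 0x8048504             */
}
```

After `push_printf_args`, reading upward from esp gives the format string address followed by a, b, c, which is the order in which printf expects to find its arguments. The four words pushed are what `add esp,0x10` releases after the call.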

你是年少的欢喜 2024-10-18 04:07:40

You raise an interesting point which I think has not been directly addressed so far. I suppose that you have seen assembly code which looks something like this:

sub esp, X
...
mov [ebp+Y], eax
call Z

This sort of disassembly is generated by certain compilers. It extends the stack first and then stores eax (which has hopefully been populated with something meaningful by that point) into the newly reserved space. The effect is equivalent to what the push mnemonic does. I can't say why certain compilers generate this code instead, but my guess is that at some point doing it this way was judged to be more efficient.
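The claimed equivalence can be checked with a small simulation. The C sketch below is an illustration only: `ToyStack`, `pass_args_push`, `pass_args_reserve` and `stacks_equal` are hypothetical names, the stack is indexed in 4-byte words, and the stores are addressed relative to esp rather than ebp, purely to keep the model short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 16

/* Hypothetical toy stack: esp is a word index, growing downward. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t   esp;
} ToyStack;

void stack_reset(ToyStack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_WORDS;
}

/* Style 1: one push per argument, last argument first. */
void pass_args_push(ToyStack *s, const uint32_t *args, size_t n) {
    for (size_t i = n; i > 0; i--) {
        s->mem[--s->esp] = args[i - 1];      /* push args[i-1] */
    }
}

/* Style 2: reserve all the space once, then store into it. */
void pass_args_reserve(ToyStack *s, const uint32_t *args, size_t n) {
    s->esp -= n;                             /* sub esp, 4*n          */
    for (size_t i = 0; i < n; i++) {
        s->mem[s->esp + i] = args[i];        /* mov [esp+4*i], value  */
    }
}

/* Returns nonzero when both stacks have the same pointer and contents. */
int stacks_equal(const ToyStack *a, const ToyStack *b) {
    return a->esp == b->esp &&
           memcmp(a->mem, b->mem, sizeof a->mem) == 0;
}
```

Both styles leave the same words at the same positions, so the callee cannot tell them apart. The difference is only in how many times esp is modified along the way.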