Why do the 32-bit and 64-bit compiled versions of this program fill memory this way?
I am trying to better understand how the stack and heap work, and I have run into a snag when comparing the 32-bit and 64-bit compiled versions of the same program. In both cases I used a guest Fedora 15 VM (32-bit and 64-bit), gcc for compiling, gdb for debugging, and the same host hardware. The program in question is very simple and is shown immediately below:
C program
    void function(int a, int b, int c, int d){
        int value;
        char buffer[10];
        value = 1234;
        buffer[0] = 'A';
    }
    int main(){
        function(1, 2, 3, 4);
    }
In the interest of space, I omitted the assembly dump of the program; however if anyone thinks it might help them answer my questions, I'd be happy to include it.
32-bit Compiled Program:
Parameters 4 (0xbffff3e4), 3 (0xbffff3e0), 2 (0xbffff3dc) and 1 (0xbffff3d8) are pushed onto the stack first. Next, the location of the instruction following the call to function() (the return address, 0x080483d1) is placed on the stack. Then the value of the base pointer for the previous stack frame (0xbffff3e8) is pushed onto the stack.
(gdb) x/16xw $esp
0xbffff3c0: 0x00000000 0x410759c3 0x4105d237 0x00000000
0xbffff3d0: 0xbffff3e8 0x080483d1 0x00000001 0x00000002//pointers
0xbffff3e0: 0x00000003 0x00000004 0x00000000 0x4105d413//followed by params
0xbffff3f0: 0x00000001 0xbffff484 0xbffff48c 0x41040fc4
64-bit Compiled Program:
However, here the values 4, 3, 2, and 1 are nowhere to be seen. No matter how far down the stack I look, all I can see is the return address (0x4004ae) and the previous stack frame's base pointer (0x7fffffffe210).
(gdb) x/16xg $rsp
0x7fffffffe200: 0x00007fffffffe210 0x00000000004004ae //pointers
0x7fffffffe210: 0x0000000000000000 0x00000036d042139d
0x7fffffffe220: 0x0000000000000000 0x00007fffffffe2f8
0x7fffffffe230: 0x0000000100000000 0x0000000000400491
0x7fffffffe240: 0x0000000000000000 0x7ade47f577d82f75
0x7fffffffe250: 0x0000000000400390 0x00007fffffffe2f0
0x7fffffffe260: 0x0000000000000000 0x0000000000000000
0x7fffffffe270: 0x8521b80ab3982f75 0x7ab3e77151682f75
64-bit Compiled Program with print statement:
Now, after adding a simple print statement to function():
    printf("%d, %c\n", value, buffer[0]);
I can see the wayward parameters (see below, 0x7fffffffe1e0-0x7fffffffe1ec). I can also see the base pointer from the previous stack frame, 0x7fffffffe210 (in 0x7fffffffe200), and the return address 0x400520 (in 0x7fffffffe208). I believe the return address changed because of the new print statement. Why are 4, 3, 2, and 1 not visible without a print statement in this case? Is the 64-bit implementation of the gcc compiler smart enough not to 'waste' memory on parameters and local variables that are never used?
(gdb) x/16xg $rsp
0x7fffffffe1e0: 0x0000000300000004 0x0000000100000002 //parameters
0x7fffffffe1f0: 0x0000000000000000 0x00000000004003e0
0x7fffffffe200: 0x00007fffffffe210 0x0000000000400520 //pointers
0x7fffffffe210: 0x0000000000000000 0x00000036d042139d
0x7fffffffe220: 0x0000000000000000 0x00007fffffffe2f8
0x7fffffffe230: 0x0000000100000000 0x0000000000400503
0x7fffffffe240: 0x0000000000000000 0xd3c0c92559feaed9
0x7fffffffe250: 0x00000000004003e0 0x00007fffffffe2f0
Finally, why does the 32-bit OS place the parameters 4, 3, 2, and 1 higher in the stack than the previously mentioned pointers, while the 64-bit OS places the parameters lower in the stack than those pointers? I was under the impression that passed parameters were always placed on the stack first (and hence would be at larger memory addresses, since the stack grows toward smaller addresses), followed by the return address and the saved base pointer (so that the calling function can be returned to and the base pointer can be reset to its previous value). This is the behavior I am observing in the 32-bit compiled code, but not in the 64-bit version. What am I misunderstanding? I appreciate any insight into this matter and apologize if my questions are unclear. Please let me know how I can be more concise (or if I am factually incorrect at any point).
Comments (2)
The 64-bit ABI used by Linux differs considerably from the 32-bit ABI: in the 64-bit world, arguments are often passed in registers rather than on the stack.
Before adding the printf(), you're not finding the arguments on the stack because the first (up to) 6 integer or pointer arguments get passed in registers (in the order %rdi, %rsi, %rdx, %rcx, %r8, %r9).
After adding the printf(), they probably get saved on the stack in the process of register contents being shuffled around for the printf() call. Take a look at the assembly; it's probably obvious once you know what the ABI looks like.
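The register order described above can be written down as a small lookup. This is only an illustrative sketch: the helper name sysv_int_arg_location is hypothetical, and it covers integer and pointer arguments only (floating-point arguments use a separate set of registers).

```c
#include <stddef.h>

/* Hypothetical helper: where the x86-64 System V ABI passes the
 * index-th (0-based) integer or pointer argument.
 * Only the integer/pointer argument class is modelled here. */
const char *sysv_int_arg_location(size_t index)
{
    static const char *const regs[] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    if (index < sizeof regs / sizeof regs[0])
        return regs[index];
    return "stack";  /* seventh and later integer args go on the stack */
}
```

For function(1, 2, 3, 4), indexes 0 through 3 map to rdi, rsi, rdx and rcx, so the caller has no reason to push any of the four values before the call.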
i386 SysV passes args on the stack.
x86-64 SysV (all non-Windows systems) passes the first 6 integer/pointer args in registers.
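For contrast, the i386 stack layout can be sketched as simple address arithmetic. After the usual push ebp / mov ebp, esp prologue, the saved EBP sits at [ebp], the return address at [ebp+4], and the first argument at [ebp+8]. The helper below has a hypothetical name and assumes 4-byte int arguments in a build that keeps the frame pointer.

```c
#include <stdint.h>

/* Hypothetical helper: address of the index-th (0-based) 4-byte stack
 * argument in an i386 SysV frame, given EBP after the prologue.
 * [ebp] = saved EBP, [ebp+4] = return address, [ebp+8] = first arg. */
uint32_t i386_arg_address(uint32_t ebp, uint32_t index)
{
    return ebp + 8u + 4u * index;
}
```

With EBP = 0xbffff3d0 (the slot holding the saved 0xbffff3e8 in the question's 32-bit dump), indexes 0 through 3 give 0xbffff3d8, 0xbffff3dc, 0xbffff3e0 and 0xbffff3e4, which is where the dump shows the values 1, 2, 3 and 4.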
In a debug build, register args will get spilled to stack memory. (Unless you use the register keyword. It does nothing with GCC / Clang at -Og or higher, but has an effect at -O0.)
In a leaf function on x86-64, that will be in the red zone, below RSP, but your GDB x command only looks at $rsp and up.
Check the generated asm and note the lack of a sub rsp, 24 or whatever, so the negative offsets from RBP (frame pointer) go below RSP. (Put your code on https://godbolt.org/ and/or see How to remove "noise" from GCC/clang assembly output?)
With printf added, it's not a leaf function anymore. The compiler reserves stack space so the variables get dumped above RSP, but still below the return address.
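The red-zone point can be checked against the addresses in the question with a little arithmetic. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the leaf build spills a, b, c, d at the same RBP offsets (-0x14 to -0x20) that the printf build's dump shows, which the question does not confirm.

```c
#include <stdbool.h>
#include <stdint.h>

#define RED_ZONE_BYTES 128u  /* size of the x86-64 SysV red zone */

/* True if addr lies in the red zone: the 128 bytes just below RSP. */
bool in_red_zone(uint64_t rsp, uint64_t addr)
{
    return addr < rsp && rsp - addr <= RED_ZONE_BYTES;
}

/* True if a GDB dump that starts at $rsp and covers `bytes` bytes
 * would include addr. */
bool shown_by_dump_from_rsp(uint64_t rsp, uint64_t bytes, uint64_t addr)
{
    return addr >= rsp && addr - rsp < bytes;
}
```

In the leaf build, RSP equals RBP at 0x7fffffffe200, so a slot at RBP-0x14 (0x7fffffffe1ec) is in the red zone and a 128-byte x/16xg $rsp dump cannot show it. In the printf build, RSP is 0x7fffffffe1e0, so the same slot is at or above RSP and appears in the dump. In GDB, something like x/8xw $rbp-32 should display the spilled values in the leaf case.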