Understanding stack allocation and alignment
I'm trying to understand how stack alignment works, as described in the question 'What is "stack alignment"?', but I'm having trouble coming up with a small example that demonstrates the behaviour. I'm examining the stack allocation of my function foo:
void foo() {
    int a = 0;
    char b[16];
    b[0] = 'a';
}
I compiled the source file with gcc -ggdb example.c -o example.out (i.e. without any other compiler flags), and the assembler dump from gdb reads:
(gdb) disassemble foo
Dump of assembler code for function foo:
0x08048394 <+0>: push %ebp
0x08048395 <+1>: mov %esp,%ebp
0x08048397 <+3>: sub $0x20,%esp
0x0804839a <+6>: movl $0x0,-0x4(%ebp)
0x080483a1 <+13>: movb $0x61,-0x14(%ebp)
0x080483a5 <+17>: leave
0x080483a6 <+18>: ret
End of assembler dump.
My stack is allocated in chunks of 16 bytes (I verified this with several other tests). According to the assembler dump, 32 bytes have been allocated here because 16 < 4+16 < 32. However, I expected the integer 'a' to be placed in the first 16 bytes and the character array in the next 16 bytes, leaving a gap of 12 bytes in between. Instead, the integer and the character array appear to occupy one contiguous chunk of 20 bytes, which is inefficient according to the discussion I referred to above. Can someone please explain what I'm missing here?
EDIT: I came to the conclusion that my stack is allocated in chunks of 16 bytes by using a program like the one below:
void foo() {
    char a[1];
}
And the corresponding assembler dump:
(gdb) disassemble foo
Dump of assembler code for function foo:
0x08048394 <+0>: push %ebp
0x08048395 <+1>: mov %esp,%ebp
0x08048397 <+3>: sub $0x10,%esp
0x0804839a <+6>: leave
0x0804839b <+7>: ret
End of assembler dump.
You can see that 16 bytes have been allocated on the stack for a character array of size 1 (only 1 byte is needed). I can increase the size of the array up to 16 and the assembler dump stays the same, but at size 17 it allocates 32 bytes on the stack. I have run many such samples and the result is always the same: stack memory is allocated in chunks of 16 bytes. A similar topic has been discussed in "Stack allocation, padding, and alignment", but what I'm more keen on finding out is why alignment has no effect in my example.
Comments (5)
I think you're missing the fact that there is no requirement for all stack variables to be individually aligned to 16-byte boundaries.
You can check how extra memory is allocated for your data structures using a tool called pahole (http://packages.debian.org/lenny/dwarves). It shows you all the holes in your program's data: the size of the data if you sum it up, and the real size allocated on your stack.
The usual rule is that variables are allocated on 32-bit boundaries. I'm not sure why you think 16 bytes has any special meaning.
I've never heard of such a thing as a specific stack alignment. If the CPU has alignment requirements, alignment is applied to all kinds of data memory, no matter whether it is stored on the stack or elsewhere. Data starts on an even address, with 16, 32 or 64 bits of data following.
16 bytes may perhaps be some sort of on-chip cache memory optimization, though that seems a bit far-fetched to me.
A good way to see this is with a structure.
On a 32-bit system this would be 4+1+4 bytes if the members were taken separately.
Because the structure and its members are aligned, "char b" will take up 4 bytes, bringing the total to 12 bytes.
Using the packed attribute you can force it to keep its minimum size. Thus this structure is 9 bytes.
You can check this as well http://sig9.com/articles/gcc-packed-structures
Hope it helps.