Why is there no segmentation fault when storing past the end of the BSS?
I'm experimenting with assembly language and wrote a program that prints 2 hardcoded bytes to stdout. Here it is:
section .text
global _start
_start:
mov eax, 0x0A31
mov [val], eax
mov eax, 4
mov ebx, 1
mov ecx, val
mov edx, 2
int 0x80
mov eax, 1
int 0x80
segment .bss
val resb 1; <------ Here
Note that I reserved only 1 byte in the bss section, but actually stored 2 bytes (the character code for `1` and a newline) at that memory location. The program worked fine: it printed the character `1` and then a newline.
But I expected a segmentation fault. Why didn't one occur? We reserved only 1 byte, but stored 2.
x86, like most other modern architectures, uses paging / virtual memory for memory protection. On x86 (again like many other architectures), the granularity is 4 KiB.
A 4-byte store to `val` won't fault unless the linker happens to place it in the last 3 bytes of a page and the next page is unmapped.
What actually happens is that you just overwrite whatever comes after `val`. In this case, that's just unused space up to the end of the page. If you had other static storage locations in the BSS, you'd step on their values. (Call them "variables" if you want, but the high-level concept of a "variable" doesn't just mean a memory location: a variable can live in a register and never need an address.)
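To make the "step on their values" point concrete, here is a minimal sketch, assuming a second one-byte location (hypothetically named `other`) is reserved directly after `val`. The file name and build commands in the comment are assumptions too, not part of the original post.

```nasm
; Assumed build: nasm -f elf32 clobber.asm && ld -m elf_i386 -o clobber clobber.o
section .text
global _start
_start:
    mov byte [other], 'A'   ; initialize the neighbour of val
    mov eax, 0x0A31         ; little-endian bytes in memory: 0x31, 0x0A, 0x00, 0x00
    mov [val], eax          ; 4-byte store: writes val, other, and the 2 bytes after them
    mov eax, 4              ; sys_write
    mov ebx, 1              ; fd 1 = stdout
    mov ecx, other          ; other was 'A', the store above replaced it with 0x0A
    mov edx, 1              ; write 1 byte
    int 0x80
    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80

section .bss
val:   resb 1
other: resb 1               ; hypothetical neighbour, placed right after val
```

The byte written to stdout is the newline that the oversized store placed in `other`, not the `'A'` stored there first. Nothing faults, because all four bytes land inside the same mapped page unless the linker happens to put `val` in the last 3 bytes of one.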
`mov [val], eax` is a 4-byte store. The operand size is determined by the register. If you wanted to do a 2-byte store, use `mov [val], ax`.
Fun fact: MASM would warn or error about an operand-size mismatch, because it magically associates sizes with symbol names based on the directive that reserves space after them. NASM stays out of your way, so `mov [val], eax` assembles without complaint. If you wrote `mov [val], 0x0A31`, though, it would be an error: neither operand implies a size, so you need `mov dword [val], 0x0A31` (or `word` or `byte`).
Placing `val` at the end of a page to get a segfault
The BSS for some reason doesn't start at the beginning of a page in a 32-bit binary, but it is near the start of a page. You're not linking with anything else that would use up most of a page in the BSS.
`nm bss-no-segfault` shows that `val` (the start of the BSS) is at `0x080490a8`, and a 4k page is `0x1000` bytes, so the last byte in the BSS mapping will be `0x08049fff`. That leaves `0xf58` bytes from `val` to the end of the page.
It seems that the BSS start address changes when I add an instruction to the `.text` section, so presumably the linker's choices here are related to packing things into an ELF executable. It doesn't make much sense, because the BSS isn't stored in the file; it's just a base address + length. I'm not going down that rabbit hole. I'm sure there's a reason that making `.text` slightly larger results in a BSS that starts at the beginning of a page, but I don't know what it is.
Anyway, if we construct the BSS so that `val` is right before the end of a page, we can get a fault when we build and run the program.
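One way to get such a layout is sketched below. It is an assumption-laden illustration, not the answer's original listing: instead of padding from the address `nm` reported, it asks NASM for a page-aligned `.bss` with the `align=4096` section qualifier and then reserves 4094 bytes of padding (hypothetically named `pad`), so that `var` occupies the last 2 bytes of the page. The file name and build commands are assumptions as well.

```nasm
; Assumed build: nasm -f elf32 bss-segfault.asm && ld -m elf_i386 -o bss-segfault bss-segfault.o
section .text
global _start
_start:
    mov eax, 0x0A31
    mov [var], ax           ; 2-byte store: fits in the last 2 bytes of the page
    mov [var], eax          ; 4-byte store: the upper 2 bytes fall in the next page,
                            ; which faults if that page is unmapped
    mov eax, 1              ; sys_exit, only reached if the store above did not fault
    xor ebx, ebx
    int 0x80

section .bss align=4096     ; ASSUMPTION: page-aligned BSS instead of address-based padding
pad: resb 4096 - 2          ; hypothetical filler up to 2 bytes before the page end
var: resb 2                 ; last 2 bytes of the page
```

Before relying on it, check the result with `nm bss-segfault`: the address of `var` should end in `ffe`. Whether the following page is really unmapped depends on the linker and kernel, so treat the fault as the expected outcome on a typical Linux setup, not a guarantee.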
`mov [var], eax` segfaults because it crosses into the unmapped page. `mov [var], ax` would work (because I put `var` 2 bytes before the end of the page).
At this point, `/proc/<PID>/smaps` shows the mappings for the process. Key things:
`rwxp` means read/write/execute, and private. Even when stopped before the first instruction, somehow the page is already "dirty" (i.e. written to). So is the text segment, but that's expected from gdb changing an instruction to `int3`.
The `08049000-0804a000` range (and the `4 kB` size of the mapping) shows us that the BSS only has 1 page mapped. There's no data segment, just text and BSS.
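If you want to look at the mappings without attaching a debugger, a process can dump its own `/proc/self/maps`. The sketch below is a hypothetical helper, not something from the answer: it uses the 32-bit `int 0x80` calls `open` (5), `read` (3), `write` (4), and `exit` (1), and the names `path`, `buf`, and `buflen` are made up for illustration. Because it adds a `.rodata` string and a 512-byte buffer, its layout will not match the question's program exactly.

```nasm
; Assumed build: nasm -f elf32 showmaps.asm && ld -m elf_i386 -o showmaps showmaps.o
buflen equ 512              ; hypothetical buffer size, kept small so the BSS stays small

section .text
global _start
_start:
    mov eax, 5              ; sys_open
    mov ebx, path
    xor ecx, ecx            ; flags = O_RDONLY (0)
    xor edx, edx            ; mode is unused without O_CREAT
    int 0x80
    test eax, eax
    js .done                ; negative return value means the open failed
    mov esi, eax            ; keep the file descriptor across later calls
.read:
    mov eax, 3              ; sys_read
    mov ebx, esi
    mov ecx, buf
    mov edx, buflen
    int 0x80
    test eax, eax
    jle .done               ; 0 = end of file, negative = error
    mov edx, eax            ; write exactly as many bytes as were read
    mov eax, 4              ; sys_write
    mov ebx, 1              ; fd 1 = stdout
    mov ecx, buf
    int 0x80
    jmp .read
.done:
    mov eax, 1              ; sys_exit
    xor ebx, ebx
    int 0x80

section .rodata
path: db "/proc/self/maps", 0

section .bss
buf: resb buflen
```

Each output line starts with an address range and a permission string such as `rw-p`. Every range begins and ends on a multiple of `0x1000`, which is the page granularity that lets a small overrun inside the BSS go unnoticed.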