Why is there no segmentation fault when storing past the end of the BSS?

Posted on 2025-01-16 20:19:31

I'm experimenting with assembly language and wrote a program which prints 2 hardcoded bytes into stdout. Here it is:

section .text
     global _start

_start:
     mov eax, 0x0A31
     mov [val], eax
     mov eax, 4
     mov ebx, 1
     mov ecx, val
     mov edx, 2

     int 0x80

     mov eax, 1
     int 0x80

 segment .bss
     val resb 1;   <------ Here

Note that I reserved only 1 byte inside the bss segment, but actually put 2 bytes (the character code for 1 and a newline) into the memory location. And the program worked fine: it printed the character 1 and then a newline.

But I expected a segmentation fault. Why didn't it occur? We reserved only 1 byte, but stored 2.

输什么也不输骨气 2025-01-23 20:19:31

x86, like most other modern architectures, uses paging / virtual memory for memory protection. On x86 (again like many other architectures), the granularity is 4 KiB.

A 4-byte store to val won't fault unless the linker happens to place it in the last 3 bytes of a page, and the next page is unmapped.
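
To make the "last 3 bytes" rule concrete, here is a minimal Python sketch of the arithmetic. The helper name `crosses_page` and the sample addresses are made up for illustration, and it assumes 4 KiB pages. It only tells you whether a store touches a second page; the store faults only if that second page is also unmapped (or not writable).

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86 Linux


def crosses_page(addr: int, size: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True if the bytes [addr, addr + size) span two pages."""
    first_page = addr // page_size
    last_page = (addr + size - 1) // page_size
    return first_page != last_page


# A 4-byte store reaches into the next page only when it starts in the
# last 3 bytes of a page (page offsets 0xFFD, 0xFFE, 0xFFF).
for offset in (0x0A8, 0xFFC, 0xFFD, 0xFFE, 0xFFF):
    print(hex(offset), crosses_page(0x1000 + offset, 4))
```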

What actually happens is that you just overwrite whatever comes after val. In this case, that's just unused space up to the end of the page. If you had other static storage locations in the BSS, you'd step on their values. (Call them "variables" if you want, but the high-level concept of a "variable" doesn't just mean a memory location: a variable can be live in a register and never need to have an address.)
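
A rough model of that clobbering, using a Python bytearray as a stand-in for the BSS. The layout is hypothetical: the neighbour called "other" does not exist in the question's program and is only there to show what would be lost.

```python
import struct

# Hypothetical layout: val is the 1 byte at offset 0, and a made-up
# neighbour "other" occupies offsets 1..3.
bss = bytearray(8)            # the BSS starts out zero-filled
bss[1:4] = b"\xAA\xBB\xCC"    # pretend "other" already holds data

# Models mov [val], eax with eax = 0x0A31: a 4-byte little-endian store.
struct.pack_into("<I", bss, 0, 0x0A31)

print(bss[:4].hex())          # the bytes of "other" have been replaced
```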




Quoting the question: "but actually put 2 bytes (charcode for 1 and newline symbol) into the memory location."

mov [val], eax is a 4-byte store. The operand-size is determined by the register. If you wanted to do a 2-byte store, use mov [val], ax.
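
As a small illustration of the difference, this Python sketch shows the bytes each store would write for the value 0x0A31, assuming x86's little-endian byte order. The variable names are only for illustration. It also suggests why the question's output looked correct: the write call was given a length of 2 (mov edx, 2), so only the first two of the four stored bytes were printed.

```python
value = 0x0A31

dword_store = value.to_bytes(4, "little")  # bytes written by mov [val], eax
word_store = value.to_bytes(2, "little")   # bytes written by mov [val], ax

print(dword_store)      # four bytes: '1', newline, then two zero bytes
print(word_store)       # two bytes: '1', newline

# write(1, val, 2) only sends the first two bytes, so both stores
# produce the same visible output.
print(dword_store[:2] == word_store)
```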

Fun fact: MASM would warn or error about an operand-size mismatch, because it magically associates sizes with symbol names based on the declaration that reserves space after them. NASM stays out of your way, so if you wrote mov [val], 0x0A31, it would be an error. Neither operand implies a size, so you need mov dword [val], 0x0A31 (or word or byte).


Placing val at the end of a page to get a segfault

The BSS for some reason doesn't start at the beginning of a page in a 32-bit binary, but it is near the start of a page. You're not linking with anything else that would use up most of a page in the BSS. nm bss-no-segfault shows that it's at 0x080490a8, and a 4k page is 0x1000 bytes, so the last byte in the BSS mapping will be 0x08049fff.
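
The page arithmetic in that paragraph can be checked with a few lines of Python. This is just a sketch of the calculation, using the address nm reported above and assuming 4 KiB pages.

```python
PAGE_SIZE = 0x1000        # 4 KiB page
bss_start = 0x080490A8    # address reported by nm in the text above

page_base = bss_start & ~(PAGE_SIZE - 1)   # round down to the page start
last_byte = page_base + PAGE_SIZE - 1      # last address inside that page

print(hex(page_base), hex(last_byte))
```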

It seems that the BSS start address changes when I add an instruction to the .text section, so presumably the linker's choices here are related to packing things into an ELF executable. It doesn't make much sense, because the BSS isn't stored in the file, it's just a base address + length. I'm not going down that rabbit hole; I'm sure there's a reason that making .text slightly larger results in a BSS that starts at the beginning of a page, but IDK what it is.

Anyway, if we construct the BSS so that val is right before the end of a page, we can get a fault:

... same .text

section .bss
dummy:  resb 4096 - 0xa8 - 2
val:    resb 1

;; could have done this instead of making up constants
;; ALIGN 4096
;; dummy2: resb 4094
;; val2:   resb 1
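
To see where those constants come from, here is the same calculation as a Python sketch. It assumes dummy lands at the 0x080490a8 address from the earlier nm output and that pages are 4 KiB.

```python
PAGE_SIZE = 4096
dummy_addr = 0x080490A8            # where the first BSS object was placed
dummy_size = 4096 - 0xA8 - 2       # the resb count used for dummy

val_addr = dummy_addr + dummy_size
page_end = (val_addr // PAGE_SIZE + 1) * PAGE_SIZE   # first byte of next page

print(hex(val_addr), "bytes left in page:", page_end - val_addr)
```

With only 2 bytes left before the page boundary, a 2-byte store still fits but a 4-byte store does not.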

Then build and run:

$ asm-link -m32 bss-no-segfault.asm
+ yasm -felf32 -Worphan-labels -gdwarf2 bss-no-segfault.asm
+ ld -melf_i386 -o bss-no-segfault bss-no-segfault.o

peter@volta:~/src/SO$ nm bss-no-segfault
080490a7 B __bss_start
080490a8 b dummy
080490a7 B _edata
0804a000 B _end         <---------  End of the BSS
08048080 T _start
08049ffe b val          <---------  Address of val

 gdb ./bss-no-segfault

 (gdb) b _start
 (gdb) r
 (gdb) set disassembly-flavor intel
 (gdb) layout reg

 (gdb) p &val
 $2 = (<data variable, no debug info> *) 0x8049ffe
 (gdb) si    # and press return to repeat a couple times

mov [val], eax segfaults because it crosses into the unmapped page. mov [val], ax would work (because I put val 2 bytes before the end of the page).

At this point, /proc/<PID>/smaps shows:

... the r-x private mapping for .text
08049000-0804a000 rwxp 00000000 00:15 2885598                            /home/peter/src/SO/bss-no-segfault
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
...
[vvar] and [vdso] pages exported by the kernel for fast gettimeofday / getpid

Key things: rwxp means read/write/execute, and private. Even stopped before the first instruction, somehow it's already "dirty" (i.e. written to). So is the text segment, but that's expected from gdb changing the instruction to int3.

The 08049000-0804a000 (and 4 kB size of the mapping) shows us that the BSS only has 1 page mapped. There's no data segment, just text and BSS.
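
As a final cross-check, this Python sketch parses the mapping line from the smaps output above and tests whether a store at val's address stays inside it. The helper name `store_fits` is made up for illustration, and the check only looks at the address range, not at permissions.

```python
# Mapping line copied from the smaps output above (whitespace collapsed).
line = ("08049000-0804a000 rwxp 00000000 00:15 2885598 "
        "/home/peter/src/SO/bss-no-segfault")

addr_range, perms = line.split()[:2]
start, end = (int(x, 16) for x in addr_range.split("-"))  # end is exclusive


def store_fits(addr: int, size: int) -> bool:
    """True if the bytes [addr, addr + size) lie entirely inside the mapping."""
    return start <= addr and addr + size <= end


val = 0x08049FFE  # address of val from the nm output

print((end - start) // 1024, "kB mapped, permissions", perms)
print("2-byte store fits:", store_fits(val, 2))
print("4-byte store fits:", store_fits(val, 4))
```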
