glibc scanf Segmentation faults when called from a function that doesn't align RSP
When I compile the code below:
global main
extern printf, scanf
section .data
msg: db "Enter a number: ",10,0
format:db "%d",0
section .bss
number resb 4
section .text
main:
mov rdi, msg
mov al, 0
call printf
mov rsi, number
mov rdi, format
mov al, 0
call scanf
mov rdi,format
mov rsi,[number]
inc rsi
mov rax,0
call printf
ret
using:
nasm -f elf64 example.asm -o example.o
gcc -no-pie -m64 example.o -o example
and then run
./example
It runs and prints: Enter a number:
but then crashes and prints:
Segmentation fault (core dumped)
So printf works fine but scanf does not.
What am I doing wrong with scanf?
Use
sub rsp, 8 / add rsp, 8 at the start/end of your function to re-align the stack to 16 bytes before your function does a call.
Or better, push/pop a dummy register, e.g. push rdx / pop rcx, or a call-preserved register like RBP that you actually wanted to save anyway. You need the total change to RSP to be an odd multiple of 8, counting all pushes and sub rsp, from function entry to any call, i.e. 8 + 16*n bytes for a whole number n.
On function entry, RSP is 8 bytes away from 16-byte alignment because the call pushed an 8-byte return address. See "Printing floating point numbers from x86-64 seems to require %rbp to be saved", "main and stack alignment", and "Calling printf in x86_64 using GNU assembler". This is an ABI requirement that you used to be able to get away with violating when there weren't any FP args for printf, but not any more.
See also "Why does the x86-64 / AMD64 System V ABI mandate a 16 byte stack alignment?"
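As a minimal sketch of the sub rsp, 8 / add rsp, 8 fix applied to the program from the question (the file name fixed.asm is made up for illustration). It also loads the scanf result as 4 bytes with esi, since number is only 4 bytes wide, and returns 0 from main; those two changes are extras and not part of the alignment fix itself.

```nasm
; fixed.asm: the question's program with the stack re-aligned (a sketch).
; Build as in the question:
;   nasm -f elf64 fixed.asm -o fixed.o
;   gcc -no-pie -m64 fixed.o -o fixed
global main
extern printf, scanf

section .data
msg:    db "Enter a number: ", 10, 0
format: db "%d", 0

section .bss
number: resb 4

section .text
main:
    sub  rsp, 8          ; RSP % 16 == 8 on entry, so now RSP % 16 == 0

    mov  rdi, msg
    xor  eax, eax        ; AL = 0: no vector-register args
    call printf

    mov  rsi, number     ; address where scanf stores the int
    mov  rdi, format
    xor  eax, eax
    call scanf

    mov  rdi, format
    mov  esi, [number]   ; load only the 4 bytes scanf wrote
    inc  esi
    xor  eax, eax
    call printf

    add  rsp, 8          ; undo the adjustment so ret finds the return address
    xor  eax, eax        ; return 0 from main
    ret
```

The total change to RSP between function entry and each call is 8 bytes, which is the n = 0 case of 8 + 16*n.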
To put it another way, RSP % 16 == 8 on function entry, and you need to ensure RSP % 16 == 0 before you call a function. How you do this doesn't matter. (Not all functions will actually crash if you don't, but the ABI does require/guarantee it.)
gcc's code-gen for glibc scanf (and more recently printf) now depends on 16-byte stack alignment even when AL == 0.
It seems to have auto-vectorized copying 16 bytes somewhere in __GI__IO_vfscanf, which regular scanf calls after spilling its register args to the stack (see footnote 1). (The many similar ways to call scanf share one big implementation as a back end to the various libc entry points like scanf, fscanf, etc.)
I downloaded Ubuntu 18.04's libc6 binary package: https://packages.ubuntu.com/bionic/amd64/libc6/download and extracted the files (with 7z x blah.deb and tar xf data.tar, because 7z knows how to extract a lot of file formats).
I can repro your bug with LD_LIBRARY_PATH=/tmp/bionic-libc/lib/x86_64-linux-gnu ./bad-printf, and also, it turns out, with the system glibc 2.27-3 on my Arch Linux desktop.
With GDB, I ran it on your program and did set env LD_LIBRARY_PATH /tmp/bionic-libc/lib/x86_64-linux-gnu then run. With layout reg, the disassembly window looks like this at the point where it received SIGSEGV:
So it copied two 8-byte objects to the stack with movq + movhps to load and movaps to store. But with the stack misaligned, movaps [rbp-0x470],xmm0 faults.
I didn't grab a debug build to find out exactly which part of the C source turned into this, but the function is written in C and compiled by GCC with optimization enabled. GCC has always been allowed to do this, but only recently did it get smart enough to take better advantage of SSE2 this way.
Footnote 1: printf / scanf with AL != 0 has always required 16-byte alignment because gcc's code-gen for variadic functions uses test al,al / je to spill the full 16-byte XMM regs xmm0..7 with aligned stores in that case. __m128i can be an argument to a variadic function, not just double, and gcc doesn't check whether the function ever actually reads any 16-byte FP args.
As above, I traced down the offending instruction as:
Inside of glibc, in strops.c. It's doing a 128-bit move to the stack.
After fixing my stack alignments, I found that on entry to main:
The gcc generated startup passes control to my program with a stack that isn't aligned to 16 bytes!
I used the aligner:
This appears wasteful, but realize that the stack must be 64-bit aligned by definition, so this is going to drop either 0 or 64 bits of stack on entry.
A call instruction is going to place 8 bytes on the stack (the return RIP), which means that a call inherently un-aligns the stack as far as the 16-byte requirement is concerned. This (in gcc) is fixed by the framing sequence:
Because the push evens the stack up to 16 bytes again.
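The code snippets this answer refers to are not reproduced on this page, so the following is only a sketch under assumptions: push rbp / mov rbp, rsp is assumed as the framing sequence, and and rsp, -16 is assumed as the aligner. Neither is confirmed to be the author's exact code, and the file name and message are made up.

```nasm
; aligned.asm: frame setup plus an explicit aligner (hypothetical sketch).
; Build:
;   nasm -f elf64 aligned.asm -o aligned.o
;   gcc -no-pie -m64 aligned.o -o aligned
global main
extern puts

section .data
msg: db "stack is aligned", 0

section .text
main:
    push rbp             ; 8 (return address) + 8 (this push): RSP % 16 == 0
    mov  rbp, rsp        ; remember the frame so RSP can be restored later
    and  rsp, -16        ; assumed aligner: clears the low 4 bits of RSP, so it
                         ; drops 0 or 8 bytes (a no-op after the push above)

    mov  rdi, msg
    call puts

    xor  eax, eax        ; return 0 from main
    leave                ; mov rsp, rbp / pop rbp: undoes the and and the push
    ret
```

Masking RSP is only safe here because RBP keeps the original value; without a saved copy there would be no way to know how many bytes the and dropped.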