gdb weird backtrace
My program is statically linked against dietlibc. It is compiled on Ubuntu x64 (for x86, using the -m32 flag) and runs on CentOS x86.
The compiled binary is only about 100 KB. I compile it with -ggdb3 and no optimization flags.
My program uses signal.h to handle SIGSEGV; the handler then calls abort().
The program runs without problems for days but occasionally segfaults. That is when I get weird backtraces that I do not understand:
    username@ubuntu:~/Desktop$ gdb -c core.28569 program-name
    GNU gdb (GDB) 7.2
    Copyright (C) 2010 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "--host=x86_64-linux-gnu --target=i386-linux-gnu".
    For bug reporting instructions, please see:
    ...
    Reading symbols from program-name...done.
    [New Thread 28569]
    Core was generated by `program-name'.
    Program terminated with signal 6, Aborted.
    #0 0x00914410 in __kernel_vsyscall ()
    Setting up the environment for debugging gdb.
    Function "internal_error" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
    Function "info_command" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
    .gdbinit:8: Error in sourced command file:
    Argument required (one or more breakpoint numbers).
    (gdb) bt
    #0 0x00914410 in __kernel_vsyscall ()
    During symbol reading, incomplete CFI data; unspecified registers (e.g., eax) at 0x914411.
    #1 0x0804d7f4 in __unified_syscall ()
    #2 0xbf8966c0 in ?? ()
    #3 <signal handler called>
    #4 0x2054454e in ?? ()
    #5 0x20524c43 in ?? ()
    #6 0x2e352e33 in ?? ()
    #7 0x32373033 in ?? ()
    #8 0x2e203b39 in ?? ()
    #9 0x2054454e in ?? ()
    #10 0x20524c43 in ?? ()
    #11 0x2e302e33 in ?? ()
    #12 0x32373033 in ?? ()
    #13 0x4d203b39 in ?? ()
    #14 0x61696465 in ?? ()
    #15 0x6e654320 in ?? ()
    #16 0x20726574 in ?? ()
    #17 0x36204350 in ?? ()
    #18 0x203b302e in ?? ()
    #19 0x54454e2e in ?? ()
    #20 0x43302e34 in ?? ()
    #21 0x00000029 in ?? ()
    #22 0xbf8989a8 in ?? ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)
    (gdb) bt full
    #0 0x00914410 in __kernel_vsyscall ()
    No symbol table info available.
    #1 0x0804d7f4 in __unified_syscall ()
    No symbol table info available.
    #2 0xbf8966c0 in ?? ()
    No symbol table info available.
    #3 <signal handler called>
    No symbol table info available.
    #4 0x2054454e in ?? ()
    No symbol table info available.
    #5 0x20524c43 in ?? ()
    No symbol table info available.
    #6 0x2e352e33 in ?? ()
    No symbol table info available.
    #7 0x32373033 in ?? ()
    No symbol table info available.
    #8 0x2e203b39 in ?? ()
    No symbol table info available.
    #9 0x2054454e in ?? ()
    No symbol table info available.
    #10 0x20524c43 in ?? ()
    No symbol table info available.
    #11 0x2e302e33 in ?? ()
    No symbol table info available.
    #12 0x32373033 in ?? ()
    No symbol table info available.
    #13 0x4d203b39 in ?? ()
    No symbol table info available.
    #14 0x61696465 in ?? ()
    No symbol table info available.
    #15 0x6e654320 in ?? ()
    No symbol table info available.
    #16 0x20726574 in ?? ()
    No symbol table info available.
    #17 0x36204350 in ?? ()
    No symbol table info available.
    #18 0x203b302e in ?? ()
    No symbol table info available.
    #19 0x54454e2e in ?? ()
    No symbol table info available.
    #20 0x43302e34 in ?? ()
    No symbol table info available.
    #21 0x00000029 in ?? ()
    No symbol table info available.
    #22 0xbf8989a8 in ?? ()
    No symbol table info available.
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)
    (gdb) quit
Comments (2)
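It's a stack overrun.
The frame addresses look like text: 0x2054454e reads as " TEN" high byte first, or "NET " in memory order.
Likewise 0x20524c43 reads as " RLC", or "CLR " in memory order.
And so on.
Treat the addresses as if they were text and see if you can identify where this text overwrites your stack.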
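
To make "treat the addresses as text" concrete, here is a minimal Python sketch that packs frames #4 to #21 from the backtrace as little-endian 32-bit words (the byte order they would have in memory on x86) and prints the bytes as ASCII. The helper name frames_to_text is made up for this illustration; the values are copied from the question.

```python
import struct

# Frame "addresses" #4..#21, copied from the backtrace in the question.
FRAMES = [
    0x2054454e, 0x20524c43, 0x2e352e33, 0x32373033, 0x2e203b39,
    0x2054454e, 0x20524c43, 0x2e302e33, 0x32373033, 0x4d203b39,
    0x61696465, 0x6e654320, 0x20726574, 0x36204350, 0x203b302e,
    0x54454e2e, 0x43302e34, 0x00000029,
]


def frames_to_text(frames):
    """Pack each value as a little-endian 32-bit word and decode as ASCII.

    frames_to_text is a hypothetical helper name, not part of GDB.
    """
    raw = b"".join(struct.pack("<I", value) for value in frames)
    # Show bytes outside printable ASCII as '.' so NUL bytes stay visible.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # prints: NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C)...
    print(frames_to_text(FRAMES))
```

The decoded text resembles the tail of a browser User-Agent string, so a reasonable first guess (and only a guess) is that a fixed-size stack buffer holding request or header text was overrun; searching the program for where such text is copied may narrow it down.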
Your stack trace is actually very easy to understand: abort() called raise(2), which made the system call by calling __unified_syscall().
The reason you get no usable stack trace in GDB is that __unified_syscall is implemented in assembly and lacks the CFI directives needed to describe how to unwind from it.
I would consider this a bug in dietlibc, and one that is quite easy to fix. See if this (untested) patch fixes it for you:
If you can't rebuild dietlibc, or if the patch is incorrect, you may still be able to analyze the stack trace better. As far as I can tell, __unified_syscall does not touch %ebp. So you might be able to get a reasonable stack trace by doing this:
Note: if xbt works, it is likely to go into the weeds around the SIGSEGV signal frame (that frame does not use a frame pointer either). This may result in complete garbage, or in a skipped frame or two (which would be exactly the frames where the SIGSEGV happened).
So you really are much better off getting proper unwind descriptors into dietlibc.
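
For readers unfamiliar with frame-pointer unwinding, the following Python sketch illustrates the general idea on a mock 32-bit address space. It is not the xbt command mentioned above and does not talk to GDB; the function name walk_frame_pointers and the contents of MOCK_STACK are invented for illustration. It assumes the usual x86 layout, where the word at %ebp is the caller's saved %ebp and the word at %ebp + 4 is the return address.

```python
def walk_frame_pointers(memory, ebp, max_frames=64):
    """Follow the saved-%ebp chain and collect return addresses.

    memory maps an address to the 32-bit word stored there (a stand-in
    for reading a core file). Hypothetical helper, for illustration only.
    """
    frames = []
    while ebp in memory and ebp + 4 in memory and len(frames) < max_frames:
        saved_ebp = memory[ebp]
        frames.append(memory[ebp + 4])  # return address of this frame
        # The stack grows down, so a caller's frame must sit at a higher
        # address. Anything else means the chain has ended or is corrupt.
        if saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return frames


# Invented stack contents: three well-formed frames, then a NULL saved %ebp.
MOCK_STACK = {
    0xbf000010: 0xbf000030, 0xbf000014: 0x08048111,
    0xbf000030: 0xbf000050, 0xbf000034: 0x08048222,
    0xbf000050: 0x00000000, 0xbf000054: 0x08048333,
}

if __name__ == "__main__":
    for pc in walk_frame_pointers(MOCK_STACK, 0xbf000010):
        print(hex(pc))
```

The "saved %ebp must be higher" check is similar in spirit to the "previous frame inner to this frame" message in the question. A walk like this only works across functions that actually maintain %ebp, which is why a signal frame without a frame pointer can send it into garbage or make it skip frames.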