GDB core file from a segfault in C++ code exposed via pybinder has no symbols
Context: I have a program that runs on a server and segfaults several times a month. It is a Python program that uses some libraries implemented in C++ and exposed by pybinder.
I was able to capture the core file on the server, and I have the source code (both the C++ and Python parts). How can I get the stack trace for the segfault?
Several things I have tried:
- I rebuilt the source code (C++ part) with the -g3 option. From my understanding, the result should have the same binary layout and addresses as the one running on the server; the only difference should be the symbol table (and possibly several other sections in the ELF file). I then ran
gdb -ex r bazel-bin/username/coredump/capture_corefile /tmp/test_coredump/corefile.python.3861066
where bazel-bin/username/coredump/capture_corefile is the Python script (with the C++ part built with the symbol table) and /tmp/test_coredump/corefile.python.3861066 is the core file I collected.
But it shows
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f58ca51332b in ?? ()
Starting program:
No executable file specified.
- I tried to get the line of code directly with llvm-symbolizer. With the Python script as the object, it fails outright:
desktop$ llvm-symbolizer --obj=bazel-bin/username/coredump/capture_corefile 0x00007f58ca51332b
LLVMSymbolizer: error reading file: The file was not recognized as a valid object file
??
??:0:0
With the shared object, it also fails:
desktop$ llvm-symbolizer --obj=bazel-bin/username/coredump/coredump_pybind.so 0x00007f58ca51332b
_fini
??:0:0
I confirmed that the symbol table is not stripped:
file bazel-bin/username/coredump/coredump_pybind.so
bazel-bin/experimental/hjiang/coredump/coredump_pybind.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[md5/uuid]=baf3b4d9a8f7955b5db6b977843e2eb0, not stripped
Does anyone know how to get the stack trace with everything I have?
Comments (1)
Getting the same binary and addresses as the one running on the server from a rebuild is by no means guaranteed, and is actually pretty hard to achieve with GCC.
You didn't mention the compiler you use, but if it is Clang (implied by your later use of llvm-symbolizer), then note that Clang currently doesn't produce the same code with and without -g.
In addition, to make this work, you need to keep all the flags originally used (including all optimization flags). It's no good to replace -O2 with -g3: the binary will be vastly different.
You can check whether your rebuilt library is any good by running nm original.so and nm replacement.so, and comparing the addresses of any symbols that appear in both outputs. The replacement.so is usable if and only if the addresses of all common symbols match.
The best practice here is to build the .so with optimization and debug info (e.g. gcc ... -g3 -O2 ...), keep that binary for future debugging, but send a stripped binary to the server. That way you are guaranteed to have the exact binary you need if/when the stripped binary crashes.
The gdb -ex r command from the question asks gdb to run a core file, which makes no sense. Whatever you tried to achieve here, that isn't the right way to do it.
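For contrast with running the core, here is a minimal sketch of a post-mortem invocation that loads the core next to an executable and prints a backtrace. It assumes the crashed process was the Python interpreter and uses /usr/bin/python3 as a hypothetical path; substitute the interpreter that actually ran on the server. build_gdb_command is an illustrative helper, not part of any tool.

```python
import shlex


def build_gdb_command(executable, core_file, gdb_commands=("bt",)):
    """Return an argv list that opens core_file post-mortem and runs gdb_commands."""
    argv = ["gdb", "-batch"]  # -batch: execute the given commands, then exit
    for command in gdb_commands:
        argv += ["-ex", command]
    # gdb expects the executable first and the core file second.
    argv += [executable, core_file]
    return argv


# Hypothetical interpreter path (an assumption); the core file path is the
# one given in the question.
command = build_gdb_command(
    "/usr/bin/python3",
    "/tmp/test_coredump/corefile.python.3861066",
)
print(shlex.join(command))
```

The returned list can be passed to subprocess.run on a machine that has gdb and the same interpreter binary installed. Note that no r (run) command is issued, since a core file is inspected rather than executed.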
Also, GDB (in general) can't help if you give it only the core. For most tasks you also need the binary that produced that core.
Your first step should be to get a crash stack trace, as described e.g. here. Once you have something that looks like a reasonable stack trace, you could try swapping in the full-debug version of the .so.
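Before swapping in a rebuilt .so, the nm comparison suggested in this answer can be automated. The following is a rough sketch, assuming the default nm output format of "address type name" per defined symbol; the helper names and the sample symbols (crash_here, helper, extra_debug_var) are hypothetical and only illustrate the check.

```python
import subprocess


def parse_nm_output(text):
    """Map symbol name -> address for every defined symbol in nm output."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols print as "<hex address> <type> <name>";
        # undefined ones ("U name") carry no address and are skipped.
        if len(parts) != 3:
            continue
        address, _symbol_type, name = parts
        try:
            symbols[name] = int(address, 16)
        except ValueError:
            continue
    return symbols


def mismatched_symbols(original_nm, replacement_nm):
    """Return {name: (original_addr, replacement_addr)} for common symbols that differ."""
    original = parse_nm_output(original_nm)
    replacement = parse_nm_output(replacement_nm)
    common = original.keys() & replacement.keys()
    return {
        name: (original[name], replacement[name])
        for name in sorted(common)
        if original[name] != replacement[name]
    }


def run_nm(path):
    """Run nm on path and return its text output (requires binutils)."""
    result = subprocess.run(["nm", path], check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical nm outputs standing in for original.so and replacement.so.
sample_original = """\
0000000000001120 T crash_here
0000000000001180 T helper
                 U malloc
"""
sample_replacement = """\
0000000000001120 T crash_here
00000000000011c0 T helper
0000000000004010 B extra_debug_var
                 U malloc
"""
for name, (old, new) in mismatched_symbols(sample_original, sample_replacement).items():
    print(f"{name}: {old:#x} != {new:#x}")
```

With real files, mismatched_symbols(run_nm("original.so"), run_nm("replacement.so")) should return an empty dict before the replacement is trusted. Symbols that exist in only one of the two outputs are ignored, matching the "common symbols" criterion above.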