I have a core dump of an executable that was not built with debug symbols. Can I recover the argv contents?

Posted 2025-01-01 02:34:59

I have a core dump of an executable that was NOT built with debug symbols.

Can I recover argv contents to see what the command line was?

If I run gdb, I can see a backtrace, and I can navigate to the main() frame. Once there, is there a way to recover argv, without knowing its exact address?

I am on x86_64 (Intel Xeon CPU), running a CentOS Linux distro/kernel.

One reason I am hopeful is that the core dump seems to show a partial argv.

(The program is postgres, and when I load the core file, gdb prints a message that includes the postgres db-user name, the client IP address, and the first 10 characters of the query.)

Comments (1)

伤感在游骋 2025-01-08 02:34:59

On x86_64, the first few function arguments are passed in registers (%rdi, %rsi, and so on) under the System V calling convention.

Therefore, when you step into the main frame, you should be able to:

(gdb) p $rdi           # == argc
(gdb) p (char**) $rsi  # == argv

(gdb) set $argv = (char**)$rsi
(gdb) set $i = 0
(gdb) while $argv[$i]
> print $argv[$i++]
> end

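Unfortunately, GDB will not normally restore $rdi and $rsi when you switch frames: they are caller-saved registers, so by the time of the crash the callees have usually overwritten them. So this example doesn't work: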

cat t.c

#include <stdlib.h>

int bar() { abort(); }
int foo() { return bar(); }
int main()
{
  foo();
  return 0;
}

gcc t.c && ./a.out
Aborted (core dumped)

gdb -q ./a.out core
Core was generated by `./a.out'.
Program terminated with signal 6, Aborted.
#0  0x00007fdc8284aa75 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
    in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x00007fdc8284aa75 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007fdc8284e5c0 in *__GI_abort () at abort.c:92
#2  0x000000000040052d in bar ()
#3  0x000000000040053b in foo ()
#4  0x000000000040054b in main ()
(gdb) fr 4
#4  0x000000000040054b in main ()
(gdb) p $rdi
$1 = 5524    ### clearly not the right value

So you'll have to work some more ...

What you can do is use the knowledge of how the Linux stack is set up at process startup, combined with the fact that GDB will restore the stack pointer:

(gdb) set backtrace past-main
(gdb) bt
#0  0x00007ffff7a8da75 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff7a915c0 in *__GI_abort () at abort.c:92
#2  0x000000000040052d in bar ()
#3  0x000000000040053b in foo ()
#4  0x0000000000400556 in main ()
#5  0x00007ffff7a78c4d in __libc_start_main (main=<optimized out>, argc=<optimized out>, ubp_av=<optimized out>, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdad8) at libc-start.c:226
#6  0x0000000000400469 in _start ()

(gdb) frame 6
(gdb) disas
Dump of assembler code for function _start:
   0x0000000000400440 <+0>: xor    %ebp,%ebp
   0x0000000000400442 <+2>: mov    %rdx,%r9
   0x0000000000400445 <+5>: pop    %rsi
   0x0000000000400446 <+6>: mov    %rsp,%rdx
   0x0000000000400449 <+9>: and    $0xfffffffffffffff0,%rsp
   0x000000000040044d <+13>:    push   %rax
   0x000000000040044e <+14>:    push   %rsp
   0x000000000040044f <+15>:    mov    $0x400560,%r8
   0x0000000000400456 <+22>:    mov    $0x400570,%rcx
   0x000000000040045d <+29>:    mov    $0x40053d,%rdi
   0x0000000000400464 <+36>:    callq  0x400428 <__libc_start_main@plt>
=> 0x0000000000400469 <+41>:    hlt    
   0x000000000040046a <+42>:    nop
   0x000000000040046b <+43>:    nop
End of assembler dump.

So now we expect the original %rsp to be at $rsp+8 (one POP, two PUSHes), but it could be at $rsp+16 because of the alignment done by the instruction at 0x0000000000400449.
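The arithmetic behind those two candidates can be replayed by hand. The sketch below is only an illustration, written in Python rather than as GDB commands. The function names are made up for this example, and it models nothing but the stack-changing instructions of the `_start` listing above (`pop`, `and`, and the two `push`es).

```python
def rsp_after_start_prologue(entry_rsp: int) -> int:
    """Replay the %rsp changes made by the _start code shown above.

    entry_rsp is %rsp at process entry, i.e. the address of argc.
    The result is what GDB would report as $rsp in the _start frame.
    """
    rsp = entry_rsp + 8          # pop %rsi   (argc is popped off the stack)
    rsp &= 0xFFFFFFFFFFFFFFF0    # and $0xfffffffffffffff0,%rsp
    rsp -= 8                     # push %rax
    rsp -= 8                     # push %rsp
    return rsp


def argc_offset_from_frame_rsp(entry_rsp: int) -> int:
    """Distance from $rsp in the _start frame up to the argc slot."""
    return entry_rsp - rsp_after_start_prologue(entry_rsp)
```

An entry %rsp that is a multiple of 16 gives an offset of 16, while one that is 8 modulo 16 gives 8, which are exactly the two candidates named above.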

Let's see what's there ...

(gdb) x/8gx $rsp+8
0x7fffbe5d5e98: 0x000000000000001c  0x0000000000000004
0x7fffbe5d5ea8: 0x00007fffbe5d6eb8  0x00007fffbe5d6ec0
0x7fffbe5d5eb8: 0x00007fffbe5d6ec4  0x00007fffbe5d6ec8
0x7fffbe5d5ec8: 0x0000000000000000  0x00007fffbe5d6ecf

That looks promising: 4 (suspected argc), followed by 4 non-NULL pointers, followed by NULL.
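That pattern check can be automated when the stack window is larger. The following Python sketch is an illustration under stated assumptions, not a GDB feature: `find_argv_block` is a hypothetical helper that takes 64-bit words already read from the stack (here copied from the `x/8gx` output above) and looks for a plausible argc followed by that many non-NULL pointers and a NULL terminator. The `max_argc` bound is an arbitrary sanity limit I picked.

```python
from typing import List, Optional, Tuple


def find_argv_block(words: List[int], max_argc: int = 4096) -> Optional[Tuple[int, int]]:
    """Return (index_of_argc, argc) for the first plausible argc/argv block.

    A block is plausible when words[i] is a small positive count, the next
    argc words are all non-zero (argv pointers), and the word after them is 0.
    """
    for i, argc in enumerate(words):
        if not 1 <= argc <= max_argc:
            continue
        pointers = words[i + 1 : i + 1 + argc]
        terminator = i + 1 + argc
        if (
            len(pointers) == argc
            and all(pointers)
            and terminator < len(words)
            and words[terminator] == 0
        ):
            return i, argc
    return None


# The eight words shown by `x/8gx $rsp+8` above.
DUMP_WORDS = [
    0x000000000000001C, 0x0000000000000004,
    0x00007FFFBE5D6EB8, 0x00007FFFBE5D6EC0,
    0x00007FFFBE5D6EC4, 0x00007FFFBE5D6EC8,
    0x0000000000000000, 0x00007FFFBE5D6ECF,
]

print(find_argv_block(DUMP_WORDS))  # prints (1, 4)
```

The first word (0x1c) is rejected because 28 pointers do not fit in the window, so the match lands on the second word, i.e. argc at `$rsp+16`.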

Let's see if that pans out:

(gdb) x/s 0x00007fffbe5d6eb8
0x7fffbe5d6eb8:  "./a.out"
(gdb) x/s 0x00007fffbe5d6ec0
0x7fffbe5d6ec0:  "foo"
(gdb) x/s 0x00007fffbe5d6ec4
0x7fffbe5d6ec4:  "bar"
(gdb) x/s 0x00007fffbe5d6ec8
0x7fffbe5d6ec8:  "bazzzz"
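What `x/s` does at each pointer is read bytes up to the next NUL. Here is a minimal Python sketch of the same step, assuming the string area has already been copied out of the core into a `bytes` object. The names `read_cstring`, `STRINGS_BASE`, `STRINGS` and `ARGV_POINTERS` are invented for this example, and the byte content is reconstructed from the four strings and addresses printed above rather than read from a real core.

```python
def read_cstring(memory: bytes, base: int, address: int) -> str:
    """Decode the NUL-terminated string stored at `address`.

    `memory` is a copy of the bytes that start at virtual address `base`.
    """
    offset = address - base
    if not 0 <= offset < len(memory):
        raise ValueError(f"address {address:#x} is outside the copied region")
    end = memory.find(b"\0", offset)
    if end == -1:  # no terminator inside the copied region
        end = len(memory)
    return memory[offset:end].decode("utf-8", errors="replace")


# Reconstructed from the x/s output above (assumption: strings are adjacent).
STRINGS_BASE = 0x7FFFBE5D6EB8
STRINGS = b"./a.out\0foo\0bar\0bazzzz\0"
ARGV_POINTERS = [0x7FFFBE5D6EB8, 0x7FFFBE5D6EC0, 0x7FFFBE5D6EC4, 0x7FFFBE5D6EC8]

argv = [read_cstring(STRINGS, STRINGS_BASE, p) for p in ARGV_POINTERS]
print(argv)
```

The reconstructed region is 23 bytes long, so it ends at 0x7fffbe5d6ecf, the same address as the first pointer after the NULL terminator in the dump.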

Indeed, that's how I invoked the binary. As a final sanity check, does 0x00007fffbe5d6ecf look like part of the environment?

(gdb) x/s 0x00007fffbe5d6f3f
0x7fffbe5d6f3f:  "SSH_AGENT_PID=2874"

Yep, that's the beginning (or the end) of the environment.

So there you have it.

Final notes: if GDB didn't print <optimized out> so much, we could have recovered argc and argv from frame #5. There is work on both GDB and GCC sides to make GDB print much less of "optimized out" ...

Also, when loading the core, my GDB prints:

Core was generated by `./a.out foo bar bazzzz'.

negating the need for this whole exercise. However, that only works for short command lines, while the solution above will work for any command line.
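The length limit comes from where that banner is stored. To my knowledge, on Linux it is taken from the `pr_psargs` field of the core's `NT_PRPSINFO` note, a fixed 80-byte buffer that the kernel fills from the start of the argument area, turning the NUL separators into spaces. The Python sketch below only imitates that behaviour: `psargs_from_argv` is a hypothetical helper, and the 80-byte size and the trimming of the trailing space are assumptions on my part, not something shown in the session above.

```python
ELF_PRARGSZ = 80  # assumed size of pr_psargs in the Linux NT_PRPSINFO note


def psargs_from_argv(argv: list) -> str:
    """Imitate how a core file's short command-line summary is built."""
    # argv strings sit back to back in memory, each followed by a NUL byte.
    raw = b"".join(arg.encode() + b"\0" for arg in argv)
    # Only the first ELF_PRARGSZ - 1 bytes fit; the last byte is the terminator.
    kept = raw[: ELF_PRARGSZ - 1]
    # NUL separators are shown as spaces; drop the trailing one for display.
    return kept.replace(b"\0", b" ").decode(errors="replace").rstrip()


print(psargs_from_argv(["./a.out", "foo", "bar", "bazzzz"]))
print(len(psargs_from_argv(["postgres", "x" * 200])))
```

Anything past the first 79 bytes is simply not in that note, which is why a long command line has to be recovered from the stack as shown above.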
