Getting the ELF header of the main executable

Posted on 2024-12-27 07:48:26

For various purposes, I am trying to obtain the address of the ELF header of the main executable without parsing /proc/self/maps. I have tried walking the link_map chain given by the dlopen/dlinfo functions, but it does not contain an entry whose l_addr points to the base address of the main executable. Is there any way to do this (standard or not) without parsing /proc/self/maps?

An example of what I'm trying to do:

#include <stdio.h>
#include <elf.h>
int main()
{
    Elf32_Ehdr* header = /* Somehow obtain the address of the ELF header of this program */;
    printf("%p\n", header);
    /* Read the header and do stuff, etc */
    return 0;
}

Comments (2)

只是我以为 2025-01-03 07:48:26

The void * pointer returned by dlopen(0, RTLD_LAZY) is actually a struct link_map * that corresponds to the main executable.

dl_iterate_phdr also passes the entry for the main executable to the callback on its very first invocation.

You are likely confused by the fact that .l_addr == 0 in the link map, and that dlpi_addr == 0 when using dl_iterate_phdr.

This happens because l_addr (and dlpi_addr) don't actually record the load address of an ELF image. Rather, they record the relocation that has been applied to that image.

Usually the main executable is built to load at 0x400000 (for x86_64 Linux) or at 0x08048000 (for ix86 Linux), and is loaded at that same address (i.e. it is not relocated).

But if you link your executable with the -pie flag, then it will be linked at 0x0, and it will be relocated to some other address.

So how do you get to the ELF header?

2023 Update:

Isn't a simpler method (if relying on undocumented details), just to call dladdr on the l_ld address in the struct link_map, and then use dli_fbase out of that? – Simon Kissane

Indeed it is. Here is a much simpler solution:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

int main(void)
{
  /* _DYNAMIC (declared in <link.h>) is this executable's dynamic section. */
  void *dyn = _DYNAMIC;
  Dl_info info;

  /* dladdr() finds the object containing dyn; dli_fbase is where it is loaded. */
  if (dladdr(dyn, &info) != 0) {
    printf("a.out loaded at %p\n", info.dli_fbase);
  }
  return 0;
}

$ gcc -g -Wall -Wextra x.c -ldl && ./a.out
a.out loaded at 0x556433ea0000  # high address here because my GCC defaults to PIE.

$ gcc -g -Wall -Wextra x.c -ldl -no-pie && ./a.out
a.out loaded at 0x400000

$ gcc -g -Wall -Wextra x.c -ldl -no-pie -m32 && ./a.out
a.out loaded at 0x8048000

Original 2012 solution:

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <link.h>
#include <stdio.h>
#include <stdlib.h>

static int
callback(struct dl_phdr_info *info, size_t size, void *data)
{
  int j;

  (void) size;  /* unused */
  (void) data;  /* unused */

  /* dlpi_addr is the relocation applied to the object, not its load address. */
  printf("relocation: 0x%lx\n", (unsigned long) info->dlpi_addr);

  /* The first PT_LOAD segment normally maps the start of the file,
     ELF header included. */
  for (j = 0; j < info->dlpi_phnum; j++) {
    if (info->dlpi_phdr[j].p_type == PT_LOAD) {
      printf("a.out loaded at %p\n",
             (void *) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
      break;
    }
  }

  /* The first callback is for the main executable;
     returning non-zero stops the iteration there. */
  return 1;
}

int
main(void)
{
  dl_iterate_phdr(callback, NULL);
  return EXIT_SUCCESS;
}

$ gcc -m32 t.c && ./a.out
relocation: 0x0
a.out loaded at 0x8048000

$ gcc -m64 t.c && ./a.out
relocation: 0x0
a.out loaded at 0x400000

$ gcc -m32 -pie -fPIC t.c && ./a.out
relocation: 0xf7789000
a.out loaded at 0xf7789000

$ gcc -m64 -pie -fPIC t.c && ./a.out
relocation: 0x7f3824964000
a.out loaded at 0x7f3824964000

Update:

Why does the man page say "base address" and not relocation?

It's a bug ;-)

I am guessing that the man page was written long before prelink, PIE, and ASLR existed. Without prelink, shared libraries are always linked to load at address 0x0, and then the relocation and the base address become one and the same.

How come dlpi_name points to an empty string when info refers to the main executable?

It's an accident of implementation.

The way this works is that the kernel open(2)s the executable and passes the open file descriptor to the loader (in the auxv[] vector, as AT_EXECFD). Everything the loader knows about the executable, it learns by reading that file descriptor.

There is no easy way on UNIX to map a file descriptor back to the name it was opened as. For one thing, UNIX supports hard-links, and there could be multiple filenames that refer to the same file.

Newer Linux kernels also pass in the name that was used to execve(2) the executable (also in auxv[], as AT_EXECFN). But that is optional, and even when it is passed in, glibc doesn't put it into .l_name / dlpi_name in order to not break existing programs which became dependent on the name being empty.

Instead, glibc saves that name in __progname and __progname_full.

The loader could readlink(2) the name from /proc/self/exe on systems that don't provide AT_EXECFN, but the /proc file system is not guaranteed to be mounted either, so that would still leave it with an empty name sometimes.

猫九 2025-01-03 07:48:26

There is the glibc dl_iterate_phdr() function. I'm not sure it gives you exactly what you want, but that is as close as I know:

"The dl_iterate_phdr() function allows an application to inquire at run time to find out which shared objects it has loaded."
http://linux.die.net/man/3/dl_iterate_phdr
