获取可执行文件中文本部分的起始和结束地址

发布于 2024-12-03 15:20:22 字数 317 浏览 1 评论 0原文

我需要获取可执行文件文本部分的开始和结束地址。我怎样才能得到它?

我可以从 _init 符号或 _start 符号获取起始地址,但是结束地址呢?我是否应该将 text 部分的结束地址视为 .rodata 部分开始之前的最后一个地址?

或者我应该编辑默认的 ld 脚本并添加自己的符号来指示文本部分的开始和结束,并在编译时将其传递给 GCC?在这种情况下,我应该在哪里放置新符号,我应该考虑 init 和 fini 部分吗?

获取文本部分的开始和结束地址的好方法是什么?

I need to get the start and end address of an executable's text section. How can I get it?

I can get the starting address from the _init symbol or the _start symbol, but what about the ending address? Shall I consider the ending address of the text section to be the last address before starting of the .rodata section?

Or shall I edit the default ld script and add my own symbols to indicate the start and end of the text section, and pass it to GCC when compiling? In this case, where shall I place the new symbols, shall I consider the init and fini section?

What is a good way to get the start and end address of the text section?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

↙厌世 2024-12-10 15:20:22

基于 ELF 的平台的 GNU binutils 默认链接器脚本通常定义大量不同的符号,可用于查找各个部分的开始和结束。

文本部分的末尾通常通过选择三种不同的符号来引用:etext_etext__etext;可以通过 __executable_start 找到开始。 (请注意,这些符号通常使用 PROVIDE() 机制导出,这意味着它们将被覆盖如果可执行文件中的其他内容定义它们而不是仅仅引用它们,特别是意味着_etext__etext。可能是比etext。)

示例:

$ cat etext.c
#include <stdio.h>

extern char __executable_start;
extern char __etext;

int main(void)
{
  printf("0x%lx\n", (unsigned long)&__executable_start);
  printf("0x%lx\n", (unsigned long)&__etext);
  return 0;
}
$ gcc -Wall -o etext etext.c
$ ./etext
0x8048000
0x80484a0
$

我不相信任何标准都指定了这些符号,因此不应假定它们是可移植的(我不知道甚至 GNU binutils 是否也提供了它们所有基于 ELF 的平台,或者提供的符号集是否在不同的 binutils 版本中发生了变化),尽管我猜测是否 a) 你正在做一些需要这些信息的事情,b) 你正在考虑黑客链接描述文件作为一个选项,然后是可移植性不用太担心!

要查看在特定平台上构建特定事物时获得的确切符号集,请将 --verbose 标志赋予 ld (或 -Wl,- -verbosegcc)来打印它选择使用的链接器脚本(实际上有几种不同的默认链接器脚本,它们根据链接器选项和您正在构建的对象类型而变化)。

The GNU binutils default linker scripts for ELF-based platforms normally define quite a number of different symbols which can be used to find the start and end of various sections.

The end of the text section is usually referenced by a choice of three different symbols: etext, _etext or __etext; the start can be found as __executable_start. (Note that these symbols are usually exported using the PROVIDE() mechanism, which means that they will be overridden if something else in your executable defines them rather than merely referencing them. In particular that means that _etext or __etext are likely to be safer choices than etext.)

Example:

$ cat etext.c
#include <stdio.h>

extern char __executable_start;
extern char __etext;

int main(void)
{
  printf("0x%lx\n", (unsigned long)&__executable_start);
  printf("0x%lx\n", (unsigned long)&__etext);
  return 0;
}
$ gcc -Wall -o etext etext.c
$ ./etext
0x8048000
0x80484a0
$

I don't believe that any of these symbols are specified by any standard, so this shouldn't be assumed to be portable (I have no idea whether even GNU binutils provides them for all ELF-based platforms, or whether the set of symbols provided has changed over different binutils versions), although I guess if a) you are doing something that needs this information, and b) you're considering hacked linker scripts as an option, then portability isn't too much of a concern!

To see the exact set of symbols you get when building a particular thing on a particular platform, give the --verbose flag to ld (or -Wl,--verbose to gcc) to print the linker script it chooses to use (there are really several different default linker scripts, which vary according to linker options and the type of object you're building).

枯寂 2024-12-10 15:20:22

谈论“the”文本段是不正确的,因为可能有多个文本段(当您拥有共享库时通常情况下可以保证,但单个 ELF 二进制文件仍然可能有多个 PT_LOAD无论如何具有相同标志的部分)。

以下示例程序转储 dl_iterate_phr 返回的所有信息。您对带有 PF_X 标志的 PT_LOAD 类型的任何段感兴趣(请注意,如果 -z execstack,PT_GNU_STACK 将包含该标志 被传递给链接器,因此您确实必须检查两者)。

#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

const char *type_str(ElfW(Word) type)
{
    switch (type)
    {
    case PT_NULL:
        return "PT_NULL"; // should not be seen at runtime, only in the file!
    case PT_LOAD:
        return "PT_LOAD";
    case PT_DYNAMIC:
        return "PT_DYNAMIC";
    case PT_INTERP:
        return "PT_INTERP";
    case PT_NOTE:
        return "PT_NOTE";
    case PT_SHLIB:
        return "PT_SHLIB";
    case PT_PHDR:
        return "PT_PHDR";
    case PT_TLS:
        return "PT_TLS";
    case PT_GNU_EH_FRAME:
        return "PT_GNU_EH_FRAME";
    case PT_GNU_STACK:
        return "PT_GNU_STACK";
    case PT_GNU_RELRO:
        return "PT_GNU_RELRO";
    case PT_SUNWBSS:
        return "PT_SUNWBSS";
    case PT_SUNWSTACK:
        return "PT_SUNWSTACK";
    default:
        if (PT_LOOS <= type && type <= PT_HIOS)
        {
            return "Unknown OS-specific";
        }
        if (PT_LOPROC <= type && type <= PT_HIPROC)
        {
            return "Unknown processor-specific";
        }
        return "Unknown";
    }
}

const char *flags_str(ElfW(Word) flags)
{
    switch (flags & (PF_R | PF_W | PF_X))
    {
    case 0 | 0 | 0:
        return "none";
    case 0 | 0 | PF_X:
        return "x";
    case 0 | PF_W | 0:
        return "w";
    case 0 | PF_W | PF_X:
        return "wx";
    case PF_R | 0 | 0:
        return "r";
    case PF_R | 0 | PF_X:
        return "rx";
    case PF_R | PF_W | 0:
        return "rw";
    case PF_R | PF_W | PF_X:
        return "rwx";
    }
    __builtin_unreachable();
}

static int callback(struct dl_phdr_info *info, size_t size, void *data)
{
    int j;
    (void)data;

    printf("object \"%s\"\n", info->dlpi_name);
    printf("  base address: %p\n", (void *)info->dlpi_addr);
    if (size > offsetof(struct dl_phdr_info, dlpi_adds))
    {
        printf("  adds: %lld\n", info->dlpi_adds);
    }
    if (size > offsetof(struct dl_phdr_info, dlpi_subs))
    {
        printf("  subs: %lld\n", info->dlpi_subs);
    }
    if (size > offsetof(struct dl_phdr_info, dlpi_tls_modid))
    {
        printf("  tls modid: %zu\n", info->dlpi_tls_modid);
    }
    if (size > offsetof(struct dl_phdr_info, dlpi_tls_data))
    {
        printf("  tls data: %p\n", info->dlpi_tls_data);
    }
    printf("  segments: %d\n", info->dlpi_phnum);

    for (j = 0; j < info->dlpi_phnum; j++)
    {
        const ElfW(Phdr) *hdr = &info->dlpi_phdr[j];
        printf("    segment %2d\n", j);
        printf("      type: 0x%08X (%s)\n", hdr->p_type, type_str(hdr->p_type));
        printf("      file offset: 0x%08zX\n", hdr->p_offset);
        printf("      virtual addr: %p\n", (void *)hdr->p_vaddr);
        printf("      physical addr: %p\n", (void *)hdr->p_paddr);
        printf("      file size: 0x%08zX\n", hdr->p_filesz);
        printf("      memory size: 0x%08zX\n", hdr->p_memsz);
        printf("      flags: 0x%08X (%s)\n", hdr->p_flags, flags_str(hdr->p_flags));
        printf("      align: %zd\n", hdr->p_align);
        if (hdr->p_memsz)
        {
            printf("      derived address range: %p to %p\n",
                (void *) (info->dlpi_addr + hdr->p_vaddr),
                (void *) (info->dlpi_addr + hdr->p_vaddr + hdr->p_memsz));
        }
    }
    return 0;
}

int main(void)
{
    dl_iterate_phdr(callback, NULL);

    exit(EXIT_SUCCESS);
}

It's incorrect to speak of "the" text segment, since there may be more than one (guaranteed for the usual case when you have shared libraries, but it's still possible for a single ELF binary to have multiple PT_LOAD sections with the same flags anyway).

The following sample program dumps out all the information returned by dl_iterate_phr. You're interested in any segment of type PT_LOAD with the PF_X flag (note that PT_GNU_STACK will include the flag if -z execstack is passed to the linker, so you really do have to check both).

#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

const char *type_str(ElfW(Word) type)
{
    switch (type)
    {
    case PT_NULL:
        return "PT_NULL"; // should not be seen at runtime, only in the file!
    case PT_LOAD:
        return "PT_LOAD";
    case PT_DYNAMIC:
        return "PT_DYNAMIC";
    case PT_INTERP:
        return "PT_INTERP";
    case PT_NOTE:
        return "PT_NOTE";
    case PT_SHLIB:
        return "PT_SHLIB";
    case PT_PHDR:
        return "PT_PHDR";
    case PT_TLS:
        return "PT_TLS";
    case PT_GNU_EH_FRAME:
        return "PT_GNU_EH_FRAME";
    case PT_GNU_STACK:
        return "PT_GNU_STACK";
    case PT_GNU_RELRO:
        return "PT_GNU_RELRO";
    case PT_SUNWBSS:
        return "PT_SUNWBSS";
    case PT_SUNWSTACK:
        return "PT_SUNWSTACK";
    default:
        if (PT_LOOS <= type && type <= PT_HIOS)
        {
            return "Unknown OS-specific";
        }
        if (PT_LOPROC <= type && type <= PT_HIPROC)
        {
            return "Unknown processor-specific";
        }
        return "Unknown";
    }
}

const char *flags_str(ElfW(Word) flags)
{
    switch (flags & (PF_R | PF_W | PF_X))
    {
    case 0 | 0 | 0:
        return "none";
    case 0 | 0 | PF_X:
        return "x";
    case 0 | PF_W | 0:
        return "w";
    case 0 | PF_W | PF_X:
        return "wx";
    case PF_R | 0 | 0:
        return "r";
    case PF_R | 0 | PF_X:
        return "rx";
    case PF_R | PF_W | 0:
        return "rw";
    case PF_R | PF_W | PF_X:
        return "rwx";
    }
    __builtin_unreachable();
}

static int callback(struct dl_phdr_info *info, size_t size, void *data)
{
    int j;
    (void)data;

    printf("object \"%s\"\n", info->dlpi_name);
    printf("  base address: %p\n", (void *)info->dlpi_addr);
    if (size > offsetof(struct dl_phdr_info, dlpi_adds))
    {
        printf("  adds: %lld\n", info->dlpi_adds);
    }
    if (size > offsetof(struct dl_phdr_info, dlpi_subs))
    {
        printf("  subs: %lld\n", info->dlpi_subs);
    }
    if (size > offsetof(struct dl_phdr_info, dlpi_tls_modid))
    {
        printf("  tls modid: %zu\n", info->dlpi_tls_modid);
    }
    if (size > offsetof(struct dl_phdr_info, dlpi_tls_data))
    {
        printf("  tls data: %p\n", info->dlpi_tls_data);
    }
    printf("  segments: %d\n", info->dlpi_phnum);

    for (j = 0; j < info->dlpi_phnum; j++)
    {
        const ElfW(Phdr) *hdr = &info->dlpi_phdr[j];
        printf("    segment %2d\n", j);
        printf("      type: 0x%08X (%s)\n", hdr->p_type, type_str(hdr->p_type));
        printf("      file offset: 0x%08zX\n", hdr->p_offset);
        printf("      virtual addr: %p\n", (void *)hdr->p_vaddr);
        printf("      physical addr: %p\n", (void *)hdr->p_paddr);
        printf("      file size: 0x%08zX\n", hdr->p_filesz);
        printf("      memory size: 0x%08zX\n", hdr->p_memsz);
        printf("      flags: 0x%08X (%s)\n", hdr->p_flags, flags_str(hdr->p_flags));
        printf("      align: %zd\n", hdr->p_align);
        if (hdr->p_memsz)
        {
            printf("      derived address range: %p to %p\n",
                (void *) (info->dlpi_addr + hdr->p_vaddr),
                (void *) (info->dlpi_addr + hdr->p_vaddr + hdr->p_memsz));
        }
    }
    return 0;
}

int main(void)
{
    dl_iterate_phdr(callback, NULL);

    exit(EXIT_SUCCESS);
}
看轻我的陪伴 2024-12-10 15:20:22

对于 Linux,请考虑使用 nm(1) 工具来检查目标文件提供的符号。您可以挑选这组符号,从中您可以了解 Matthew Slattery 在其答案中提供的两个符号。

For Linux, consider using nm(1) tool to inspect what symbols the object file provides. You can pick through this set of symbols, where you could learn both of the symbols that Matthew Slattery provided in his answer.

星軌x 2024-12-10 15:20:22

不保证 .rodata 始终直接位于 .text 之后。您可以使用 objdump -h file 和 readelf --sections file 来获取更多信息。使用 objdump,您可以将大小和偏移量获取到文件中。

.rodata is not guaranteed to always come directly after .text. You can use objdump -h file and readelf --sections file to get more info. With objdump you get both size and offset into file.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文