In the ELF file format, is sh_addr always equal to sh_offset?

Posted on 2024-09-06 05:25:23

Recently (yay no more school) I've been teaching myself about the ELF file format. I've largely been following the documentation here: http://www.skyfree.org/linux/references/ELF_Format.pdf.

It was all going great, and I wrote this program to give me info about an ELF file's sections:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <elf.h>


void dumpShdrInfo(Elf32_Shdr elfShdr, const char *sectionName)
{
    printf("Section '%s' starts at 0x%08X and ends at 0x%08X\n",
        sectionName, elfShdr.sh_offset, elfShdr.sh_offset + elfShdr.sh_size);
}

int search(const char *name)
{
    Elf32_Ehdr elfEhdr;
    Elf32_Shdr *elfShdr;
    FILE *targetFile;
    char tempBuf[64];
    int i, ret = -1;

    targetFile = fopen(name, "r+b");

    if(targetFile)
    {
        /* read the ELF header */
        fread(&elfEhdr, sizeof(elfEhdr), 1, targetFile);

        /* Elf32_Ehdr.e_shnum specifies how many sections there are */
        elfShdr = calloc(elfEhdr.e_shnum, sizeof(*elfShdr));
        assert(elfShdr);

        /* set the file pointer to the section header offset and read it */
        fseek(targetFile, elfEhdr.e_shoff, SEEK_SET);
        fread(elfShdr, sizeof(*elfShdr), elfEhdr.e_shnum, targetFile);

        /* loop through every section */
        for(i = 0; (unsigned int)i < elfEhdr.e_shnum; i++)
        {
            /* if Elf32_Shdr.sh_addr isn't 0 the section will appear in memory */
            if(elfShdr[i].sh_addr)
            {
                /* set the file pointer to the location of the section's name and then read the name */
                fseek(targetFile, elfShdr[elfEhdr.e_shstrndx].sh_offset + elfShdr[i].sh_name, SEEK_SET);
                fgets(tempBuf, sizeof(tempBuf), targetFile);

                #if defined(DEBUG)
                dumpShdrInfo(elfShdr[i], tempBuf);
                #endif
            }
        }

        fclose(targetFile);
        free(elfShdr);
    }

    return ret;
}

int main(int argc, char *argv[])
{
    if(argc > 1)
    {
        search(argv[1]);
    }
    return 0;
}

After running it a few times on a couple of files, I noticed something weird. The '.text' section always began at a very low virtual address (we're talking smaller than 1000h). After digging around with gdb for a while, I noticed that for every section, sh_addr was equal to sh_offset.

This is what I'm confused about - Elf32_Shdr.sh_addr is documented as being "the address at which the first byte should reside", while Elf32_Shdr.sh_offset is documented as being "the byte offset from the beginning of the file to the first byte in the section". If those are both the case, it doesn't really make sense to me that they're equal. Why is this?

Now, I know there are sections that contain uninitialized data (.bss I think), so it would make sense that that data would not appear in the file but would appear in the process's memory. This would mean that for every section that comes after the aforementioned one, figuring out its virtual address would be a lot more complicated than a simple variable.

That being said, is there a way of actually determining a section's virtual address?

Comments (2)

明月松间行 2024-09-13 05:25:23

I tried that, and in my example Elf32_Shdr.sh_addr isn't the same as Elf32_Shdr.sh_offset. It is shifted by 0x08048000, which is the virtual start address of the program in memory.
For the '.text' section, Elf32_Shdr.sh_offset is 0x00000570 and Elf32_Shdr.sh_addr is 0x08048570.

As you quoted from the documentation, Elf32_Shdr.sh_offset is "the byte offset from the beginning of the file to the first byte in the section":

> hexdump -C -s 0x00000570 -n 64 elffile
00000570  31 ed 5e 89 e1 83 e4 f0  50 54 52 68 b0 88 04 08  |1.^.....PTRh....|
00000580  68 c0 88 04 08 51 56 68  66 88 04 08 e8 3b ff ff  |h....QVhf....;..|
00000590  ff f4 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |................|
000005a0  55 89 e5 83 ec 08 80 3d  44 a0 04 08 00 74 0c eb  |U......=D....t..|

and Elf32_Shdr.sh_addr is "the address at which the first byte should reside", that is, the virtual address of the data in memory:

(gdb) print/x *(char[64] *) 0x08048570
$4 = {
0x31, 0xed, 0x5e, 0x89, 0xe1, 0x83, 0xe4, 0xf0, 0x50, 0x54, 0x52, 0x68, 0xb0, 0x88, 0x04, 0x08,
0x68, 0xc0, 0x88, 0x04, 0x08, 0x51, 0x56, 0x68, 0x66, 0x88, 0x04, 0x08, 0xe8, 0x3b, 0xff, 0xff,
0xff, 0xf4, 0x90 <repeats 14 times>,
0x55, 0x89, 0xe5, 0x83, 0xec, 0x08, 0x80, 0x3d, 0x44, 0xa0, 0x04, 0x08, 0x00, 0x74, 0x0c, 0xeb}
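
The relationship between the two fields in this example can be written down as a small sketch. It assumes a section that is actually loaded from the file (such as '.text'); the helper names sectionLoadDelta and addrToFileOffset are made up for illustration and are not part of <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Difference between where the section is mapped in memory and where its
   bytes sit in the file. Only meaningful for sections loaded from the file. */
static Elf32_Addr sectionLoadDelta(const Elf32_Shdr *shdr)
{
    assert(shdr != NULL);
    return shdr->sh_addr - shdr->sh_offset;
}

/* Hypothetical helper: translate a virtual address that lies inside the
   section into the matching file offset.
   Returns 0 on success, -1 if vaddr is outside the section. */
static int addrToFileOffset(const Elf32_Shdr *shdr, Elf32_Addr vaddr,
                            Elf32_Off *out)
{
    assert(shdr != NULL && out != NULL);
    if (vaddr < shdr->sh_addr || vaddr - shdr->sh_addr >= shdr->sh_size)
        return -1;
    *out = shdr->sh_offset + (vaddr - shdr->sh_addr);
    return 0;
}
```

With the values above (sh_addr 0x08048570, sh_offset 0x00000570), sectionLoadDelta gives 0x08048000, which is why the hexdump at file offset 0x570 and the gdb dump at address 0x08048570 show the same bytes.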
别靠近我心 2024-09-13 05:25:23

Okay, after taking a look at rudi-moore's answer I thought I'd investigate with gdb one more time...

It turns out in my dumpShdrInfo I was printing sh_offset instead of sh_addr. I have vivid memories of writing that function and typing out "sh_addr", as well as debugging with gdb and seeing sh_offset being equal to sh_addr.

However, I guess I'm an idiot and my memories aren't worth that much, because as soon as I changed it to sh_addr and recompiled, it worked. That's what I get for programming at 5AM. :/
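
For reference, a corrected version of the printing routine might look roughly like the sketch below. It is not the exact code from the question: it takes the header by pointer, formats into a buffer first, and the helper names formatShdrInfo and sectionIsLoaded are invented here. The SHF_ALLOC test is the flag the ELF specification uses to mark sections that occupy memory at run time; the question's sh_addr != 0 check is only an approximation of it.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>

/* Hypothetical helper: non-zero if the section occupies memory at run time. */
static int sectionIsLoaded(const Elf32_Shdr *shdr)
{
    assert(shdr != NULL);
    return (shdr->sh_flags & SHF_ALLOC) != 0;
}

/* Format the section's in-memory range using sh_addr (virtual address)
   rather than sh_offset (position in the file).
   Returns whatever snprintf returns. */
static int formatShdrInfo(char *buf, size_t bufSize,
                          const Elf32_Shdr *shdr, const char *sectionName)
{
    assert(buf != NULL && shdr != NULL && sectionName != NULL);
    return snprintf(buf, bufSize,
                    "Section '%s' starts at 0x%08X and ends at 0x%08X",
                    sectionName,
                    (unsigned int)shdr->sh_addr,
                    (unsigned int)(shdr->sh_addr + shdr->sh_size));
}

/* Variant of dumpShdrInfo that prints the virtual address range. */
static void dumpShdrInfo(const Elf32_Shdr *shdr, const char *sectionName)
{
    char line[128];
    formatShdrInfo(line, sizeof line, shdr, sectionName);
    puts(line);
}
```

Because every section header carries its own sh_addr, no running total over earlier sections (including .bss) is needed to find a section's virtual address.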
