More info on the memory layout of an executable program (process)

Posted on 2024-08-16 03:41:35

I attended an interview at Samsung. They asked a lot of questions about the memory layout of a program. I barely know anything about this.

I googled "Memory layout of an executable program" and "Memory layout of process".

I'm surprised to see that there isn't much info on these topics. Most of the results are forum queries. I just wonder why?

These are the few links I found:

  1. Run-Time Storage Organization
  2. Run-Time Memory Organization
  3. Memory layout of C process (PDF)

I want to learn this from a proper book instead of some web links. (Randy Hyde's is also a book, but I'd like some other book.) Which book covers this subject clearly and in more depth?

I also wonder why operating systems books don't cover this. I read Stallings, 6th edition. It only discusses the Process Control Block.

Creating this entire layout is the linker's task, right? Where can I read more about this process? I want COMPLETE info, from a program on disk to its execution on the processor.

EDIT:

Initially, I was not clear even after reading the answers given below. Recently I came across these articles, and after reading them I understood things clearly.

Resources that helped me in understanding:

  1. www.tenouk.com/Bufferoverflowc/Bufferoverflow1b.html
  2. 5 part PE file format tutorial: http://win32assembly.online.fr/tutorials.html
  3. Excellent article : http://www.linuxforums.org/articles/understanding-elf-using-readelf-and-objdump_125.html
  4. PE Explorer: http://www.heaventools.com/

Yes, "layout of an executable program (PE/ELF)" != "memory layout of a process". Find out for yourself in the 3rd link. :)

Now that my concepts are clear, my questions make me look so stupid. :)

2024-08-23 03:41:35

How things are loaded depends very strongly on the OS and on the binary format used, and the details can get nasty. There are standards for how binary files are laid out, but it's really up to the OS how a process's memory is laid out. This is probably why the documentation is hard to find.

To answer your questions:

  1. Books:
    • If you're interested in how processes lay out their memory, look at Understanding the Linux Kernel. Chapter 3 talks about process descriptors, creating processes, and destroying processes.
    • The only book I know of that covers linking and loading in any detail is Linkers and Loaders by John Levine. There's an online and a print version, so check that out.

  2. Executable code is created by the compiler and the linker, but it's the linker that puts things in the binary format the OS needs. On Linux, this format is typically ELF; on Windows it's PE, which is built on COFF; on older Unixes it's COFF; and on Mac OS X it's Mach-O. This isn't a fixed list, though. Some OSes can and do support multiple binary formats. Linkers need to know the output format to create executable files.

  3. The process's memory layout is pretty similar to the binary format, because a lot of binary formats are designed to be mmap'd so that the loader's task is easier.

    It's not quite that simple though. Some parts of the binary format (like zero-initialized static data, the .bss section in ELF) are not stored directly in the binary file. Instead, the binary just contains the size of these sections. When the process is loaded into memory, the loader knows to allocate the right amount of memory, but the binary file doesn't need to contain large empty sections.

    Also, the process's memory layout includes some space for the stack and the heap, where a process's call frames and dynamically allocated memory go. These generally live at opposite ends of a large address space.
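The point in item 3 about sizes being recorded instead of contents can be seen in ELF program headers, where each loadable segment carries both a size in the file (`p_filesz`) and a size in memory (`p_memsz`). Here is a minimal sketch in Python, assuming a 64-bit little-endian ELF file; the function name `load_segments` is made up for illustration, and the field offsets follow the ELF64 header and program header layouts.

```python
import struct

PT_LOAD = 1  # program header type for a segment the loader maps into memory


def load_segments(image: bytes):
    """Return (vaddr, filesz, memsz) for every PT_LOAD segment.

    Assumption: `image` holds a 64-bit little-endian ELF file.
    """
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    # ELF64 header: e_phoff is at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", image, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 54)

    segments = []
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        #             p_filesz, p_memsz, p_align
        (p_type, _flags, _offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = struct.unpack_from("<IIQQQQQQ", image, base)
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_filesz, p_memsz))
    return segments
```

On a Linux machine one could pass it the bytes of an executable, for example `load_segments(open("/bin/ls", "rb").read())` (the path is only an example). A segment whose memory size is larger than its file size is where the loader has to supply zero-filled memory that the file never stored.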

This really just scratches the surface of how binaries get loaded, and it doesn't cover anything about dynamic libraries. For a really detailed treatment of how dynamic linking and loading work, read How to Write Shared Libraries.
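As a small illustration of the formats named in item 2, the first few bytes of a file are usually enough to tell them apart. The sketch below is only that, a sketch: `detect_binary_format` is a hypothetical helper, it ignores Mach-O "fat" (multi-architecture) files, and it does nothing beyond checking the well-known magic numbers.

```python
import struct

# Mach-O magic numbers for 32-bit and 64-bit files, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xfe\xed\xfa\xcf",
    b"\xce\xfa\xed\xfe", b"\xcf\xfa\xed\xfe",
}


def detect_binary_format(image: bytes) -> str:
    """Guess the executable format from the leading bytes of a file."""
    if image[:4] == b"\x7fELF":
        return "ELF"
    if image[:4] in MACHO_MAGICS:
        return "Mach-O"
    # A PE file starts with a DOS stub ("MZ"); the 32-bit value at offset
    # 0x3C points to the "PE\0\0" signature that precedes the COFF header.
    if image[:2] == b"MZ" and len(image) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
        if image[pe_offset:pe_offset + 4] == b"PE\0\0":
            return "PE"
    return "unknown"
```

An OS loader makes a similar decision when it is asked to execute a file, which is one reason a single OS can support several binary formats.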

噩梦成真你也成魔 2024-08-23 03:41:35

Here is one way a program can be executed from a file (*nix).

  • The process is created (e.g. fork()). This gives the new process its own memory map. This includes a stack in some area of memory (usually high up in memory somewhere).
  • The new process calls exec() to replace the current executable (often a shell) with the new executable. Often, the new executable's .text (executable code and constants) and .data (read/write initialized variables) sections are set up for demand paging, that is, their pages are mapped into the process's memory space as they are needed. Often, the .text section comes first, followed by .data. The .bss section (uninitialized variables) is often allocated after the .data section. It is often mapped so that the first access to a page containing a .bss variable returns a page of zeros. The heap often starts at the next page boundary after the .bss section. The heap then grows up in memory while the stack grows down (remember I said usually, there are exceptions!).
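The two steps above can be sketched from user space with Python's thin wrappers around the POSIX calls. This is a minimal sketch, assuming a *nix system; `run_via_fork_exec` is a made-up helper name, and the exit code 127 for a failed exec is just a common shell convention.

```python
import os
import sys


def run_via_fork_exec(argv):
    """Run `argv` in a child process via fork() then exec(); return its exit code."""
    pid = os.fork()  # the child starts as a copy of this process, with its own memory map
    if pid == 0:
        try:
            # Replace the copied image with the new executable. On success
            # this call never returns.
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, 0)  # parent: wait for the child to finish
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    code = run_via_fork_exec([sys.executable, "-c", "print('hello from the new image')"])
    print("child exit code:", code)
```

Everything the answer describes about .text, .data, .bss and the heap happens inside the kernel during the exec step; the caller only sees that the old program image is gone.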

If the heap and stack collide, the process has typically run out of memory. Placing the stack in high memory, at the opposite end from the heap, leaves the most room for both to grow before that happens.

In a system without a memory management unit, demand paging is usually unavailable but the same memory layout is often used.
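On Linux, the resulting layout of a running process can be inspected through `/proc/<pid>/maps`. The following is a rough sketch, assuming that file's usual text format (address range, permissions, offset, device, inode, optional name); `parse_maps` and `heap_below_stack` are hypothetical helpers, and with address randomization or allocators that use mmap the picture can differ from the classic one described above.

```python
import os


def parse_maps(text):
    """Parse /proc/<pid>/maps text into (start, end, perms, name) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        name = fields[5].strip() if len(fields) > 5 else ""
        regions.append((start, end, fields[1], name))
    return regions


def heap_below_stack(regions):
    """Return True/False for the classic ordering, or None if a region is missing."""
    heap = [r for r in regions if r[3] == "[heap]"]
    stack = [r for r in regions if r[3] == "[stack]"]
    if not heap or not stack:
        return None
    return heap[0][1] <= stack[0][0]


if __name__ == "__main__":
    maps_path = "/proc/self/maps"  # Linux-specific; absent on other systems
    if os.path.exists(maps_path):
        with open(maps_path) as f:
            regions = parse_maps(f.read())
        for start, end, perms, name in regions:
            print(f"{start:016x}-{end:016x} {perms} {name}")
        print("heap below stack:", heap_below_stack(regions))
```

Run on a Linux box, this prints the mappings of the Python interpreter itself, so the executable's segments, the `[heap]` region, shared libraries and the `[stack]` region can be compared against the description above.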
