How are sbrk/brk implemented in Linux?
I was thinking about how the Linux kernel implements system calls, and I was wondering if someone could give me a high-level view of how sbrk/brk work.
I've looked through the kernel code, but there is so much of it that I can't follow it. Could someone give me a summary?
Comments (4)
At a very high level, the Linux kernel tracks the memory visible to a process as a set of "memory areas" (struct vm_area_struct). Another structure, struct mm_struct, represents (again at a very high level) the process's whole address space. Each process (except some kernel threads) has exactly one struct mm_struct, which in turn points to all the struct vm_area_struct for the memory it can access.
The sys_brk system call (found in mm/mmap.c) simply adjusts some of these memory areas. (sbrk is a glibc wrapper around brk.) It does so by comparing the old value of the brk address (stored in struct mm_struct) with the requested value.
It would be simpler to look at the mmap family of functions first, since brk is a special case of it.
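The relationship between sbrk and brk can be mimicked from user space. Below is a minimal sketch, assuming Linux with glibc, of how an sbrk-style call could be layered on top of brk. The name sketch_sbrk is hypothetical and used only for illustration; it is not glibc's actual implementation, which differs in detail.

```c
#define _DEFAULT_SOURCE   /* expose brk()/sbrk() declarations in <unistd.h> */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: move the program break by `increment` bytes using
 * brk(). Returns the previous break, or (void *)-1 on failure. */
void *sketch_sbrk(intptr_t increment)
{
    void *old_break = sbrk(0);   /* sbrk(0) only reports the current break */
    if (old_break == (void *)-1)
        return (void *)-1;

    if (increment == 0)
        return old_break;        /* nothing to adjust */

    /* Request the new break. Whether this grows or shrinks the heap
     * depends on how the new value compares with the old one. */
    if (brk((char *)old_break + increment) != 0)
        return (void *)-1;

    /* Sanity check: the break now sits where we asked for it. */
    assert(sbrk(0) == (void *)((char *)old_break + increment));
    return old_break;
}
```

Calling sketch_sbrk(4096) should return the old break and leave the break 4096 bytes higher; sketch_sbrk(-4096) moves it back down.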
You have to understand how virtual memory works, and how an MMU mapping relates to real RAM.
Real RAM is divided into pages, traditionally 4 kB each. Each process has its own MMU mapping, which presents that process with a linear address space (4 GB on 32-bit Linux). Of course, not all of it is actually allocated. At first it is almost empty; that is, no real page is associated with most addresses.
When the process hits a non-allocated address (trying to read, write, or execute it), the MMU generates a fault (similar to an interrupt) and the VM system is invoked. If it decides that some RAM should be there, it picks an unused RAM page and associates it with that address range.
That way, the kernel doesn't care how the process uses memory, and the process doesn't really care how much RAM there is; it always sees the same linear 4 GB address space.
Now, brk/sbrk work at a slightly higher level: in principle, any memory address beyond that mark (the program break) is invalid and won't get a RAM page if accessed; the process is killed instead. The userspace library manages memory allocations within this limit, and asks the kernel to increase it only when needed.
But even if a process started by setting brk to the maximum allowed, it wouldn't get real RAM pages allocated until it actually accessed those addresses.
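The last point, that raising the break does not by itself consume RAM, can be observed directly. The sketch below assumes Linux with glibc and a readable /proc/self/statm; the helper names resident_pages, grow_break, and touch_region are hypothetical. It is an illustration of the idea rather than a precise measurement, since the kernel's resident-page counters are only approximate.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in <unistd.h> */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Resident set size in pages, read from /proc/self/statm (Linux-specific).
 * Returns -1 if the file cannot be read or parsed. */
long resident_pages(void)
{
    char buf[128];
    long total = 0, resident = 0;
    int fd = open("/proc/self/statm", O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, sizeof buf - 1);
    close(fd);
    if (n <= 0)
        return -1;
    buf[n] = '\0';
    /* The first two statm fields are total size and resident size, in pages. */
    if (sscanf(buf, "%ld %ld", &total, &resident) != 2)
        return -1;
    return resident;
}

/* Raise the program break by `bytes`. Returns the start of the new
 * region, or NULL if the kernel refused. */
void *grow_break(size_t bytes)
{
    assert(bytes > 0 && bytes <= (size_t)INTPTR_MAX);
    void *start = sbrk((intptr_t)bytes);
    return start == (void *)-1 ? NULL : start;
}

/* Write to every byte so that each page in the region is faulted in. */
void touch_region(void *start, size_t bytes)
{
    memset(start, 0xAB, bytes);
}
```

A typical experiment is to call resident_pages(), then grow_break() with a few tens of megabytes, then resident_pages() again, and finally touch_region() followed by a third resident_pages(). The resident count should stay roughly flat after growing the break and jump only after the region has been written to.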
Well, from a super-high-level perspective, the kernel allocates a pageable block of memory, modifies the page tables of the requesting process so that the memory is mapped into the process's virtual address space, then returns the address.
A key concept in how the Linux kernel hands memory to a user process is that the process's available heap (the data segment) grows upward from the bottom. The kernel does not keep track of individual chunks of memory, only one contiguous block. The brk/sbrk calls expand the amount of memory the process has, but it is up to the process to divide it into usable pieces.
A key consequence is that unused memory scattered across the process's address space cannot be returned to the operating system for other uses. Only memory at the very end of the data segment can be returned, so in-use memory near the end would first have to be moved down to lower addresses to free up the top. In practice, almost no allocators do this. For this reason, it's usually important to manage the peak amount of memory a process uses well, because that determines how much memory is left for other processes.
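The "only the top can be returned" rule can be illustrated with a toy allocator. This is a minimal sketch, assuming Linux with glibc and that nothing else moves the break in between; bump_alloc and bump_release_if_top are hypothetical names, and a real allocator would also handle alignment and keep free lists for blocks it cannot return.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in <unistd.h> */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical bump allocator: every allocation extends the program break. */
void *bump_alloc(size_t size)
{
    assert(size > 0 && size <= (size_t)INTPTR_MAX);
    void *block = sbrk((intptr_t)size);
    return block == (void *)-1 ? NULL : block;
}

/* Give a block back to the kernel only if it ends exactly at the current
 * break. Returns 1 if the break was lowered, 0 if the block is buried
 * under later allocations (or the call failed). */
int bump_release_if_top(void *block, size_t size)
{
    char *block_end = (char *)block + size;
    if (block_end != (char *)sbrk(0))
        return 0;   /* something lives above this block */
    return sbrk(-(intptr_t)size) == (void *)-1 ? 0 : 1;
}
```

If block a is allocated before block b, releasing a fails while b is still live, because a no longer touches the break. Once b has been released, a becomes the top of the heap and can be returned as well.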