mmap() internals

Posted on 2024-07-17 14:17:17

It's widely known that the most significant feature of mmap() is that a file mapping can be shared between many processes. But it's no less widely known that every process has its own address space.

The question is: where is the data of a memory-mapped file actually kept, and how do processes get access to that memory?
I don't mean *(pa+i) and other high-level stuff; I mean the internals of the mechanism.

Comments (5)

[浮城] 2024-07-24 14:17:17

This happens at the virtual memory management layer of the operating system. When you memory-map a file, the memory manager basically treats the file as if it were swap space for the process. As you access pages in your virtual address space, the memory manager has to translate those addresses to physical memory. When you cross a page boundary into a page that is not currently resident, this causes a page fault, at which point the OS must load the corresponding chunk of disk space into a page of physical memory and set up the mapping. With mmap, it simply does so from your file instead of from its own swap space.

If you want lots of detail on how this happens, you'll have to tell us which operating system you're using, as implementation details vary.
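
To make the "access pages, fault them in from the file" idea concrete, here is a minimal sketch in Python, whose standard mmap module wraps the underlying OS call. The helper names read_through_mapping and demo are made up for illustration, and the page faults themselves are invisible to the program; the comments only describe what the OS is expected to do underneath.

```python
import mmap
import os
import tempfile


def read_through_mapping(path, offset):
    """Return the byte at `offset` by indexing a read-only mapping of `path`."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Indexing touches the page that contains `offset`. If that page is
            # not resident yet, the OS services a page fault and fills the page
            # from the file; the program just sees an ordinary memory read.
            return m[offset]


def demo():
    # Build a file spanning three pages, with a distinct byte value per page.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(3):
            tmp.write(bytes([ord("A") + i]) * mmap.PAGESIZE)
        path = tmp.name
    try:
        # Read the first byte of each page through the mapping.
        return [read_through_mapping(path, i * mmap.PAGESIZE) for i in range(3)]
    finally:
        os.unlink(path)
```

Calling demo() reads one byte from each of the three pages without ever issuing an explicit read() on the file.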

风月客 2024-07-24 14:17:17

This is very implementation-dependent, but the following is one possible implementation:

When a file is first memory-mapped, the data isn't stored anywhere in memory; it's still on disk. The virtual memory manager (VMM) allocates a range of virtual memory addresses to the process for the file, but those addresses aren't immediately added to the page table.

When the program first tries to read or write one of those addresses, a page fault occurs. The OS catches the page fault, figures out that the address corresponds to a memory-mapped file, and reads the appropriate disk sectors into an internal kernel buffer. Then it maps the kernel buffer into the process's address space and restarts the user instruction that caused the page fault. If the faulting instruction was a read, we're all done for now. If it was a write, the data is written to memory and the page is marked as dirty. Subsequent reads or writes to data within the same page do not require any disk I/O, since the data is already in memory.

When the mapping is flushed or the file is closed, any pages that have been marked dirty are written back to disk. (The OS may also write dirty pages back earlier, on its own schedule.)

Using memory-mapped files is advantageous for programs that read or write disk sectors in a very haphazard manner. You only read the disk sectors that are actually used, instead of reading the entire file.
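
The write path described above can be sketched as follows, again using Python's standard mmap module. The function name write_via_mapping is hypothetical, and the comments describe the expected OS behavior under the "one possible implementation" assumption rather than something the code can observe directly.

```python
import mmap
import os
import tempfile


def write_via_mapping():
    """Modify a file through a shared mapping, flush it, then re-read the file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello world")
        with mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE) as m:
            # A plain memory store: the first touch faults the page in, and the
            # write marks it dirty. No write() system call is issued here.
            m[0:5] = b"HELLO"
            # Ask the OS to write dirty pages back to the file (msync on POSIX).
            m.flush()
        # Read the file through the ordinary file API to see the result.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.unlink(path)
```

Here write_via_mapping() returns b"HELLO world": the change made through memory shows up when the file is read normally.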

烟酒忠诚 2024-07-24 14:17:17

I'm not really sure what you're asking, but mmap() sets aside a chunk of virtual memory to hold the given amount of data. The region is usually anonymous, though it can also be file-backed.

A process is an OS entity, and it gains access to memory-mapped areas through the OS-prescribed method: calling mmap().
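
For the anonymous case, a minimal sketch: in Python's mmap module, passing -1 as the file descriptor requests a mapping with no file behind it. The function name anonymous_region is made up for this example.

```python
import mmap


def anonymous_region(size=mmap.PAGESIZE):
    """Create an anonymous mapping and return (first byte, first 4 bytes, length)."""
    # fileno=-1 means "no backing file": the OS hands out zero-filled memory.
    with mmap.mmap(-1, size) as m:
        first = m[0]      # untouched anonymous memory reads as zero
        m[:4] = b"data"   # the region behaves like a writable byte buffer
        return first, bytes(m[:4]), len(m)
```

The only difference from a file-backed mapping at this level is the descriptor argument; the process still receives a range of virtual addresses either way.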

合久必婚 2024-07-24 14:17:17

The kernel has internal buffers representing chunks of memory. Any given process is assigned a memory mapping in its own address space which refers to that buffer. A number of processes may have their own mappings, but they all end up resolving to the same chunk (via the kernel buffer).

This is a simple enough concept, but it can get a little tricky when processes write. To keep things simple in the read-only case, there's usually copy-on-write functionality that's only used as needed.
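
The difference between a shared mapping and a copy-on-write one can be sketched with two processes. This example assumes a Unix-like system (it relies on os.fork and the MAP_SHARED / MAP_PRIVATE flags) and uses anonymous mappings rather than a file to keep the two effects separate; shared_vs_private is a hypothetical name.

```python
import mmap
import os


def shared_vs_private():
    """Let a child process write to a shared and a private mapping; return the parent's view."""
    # Both mappings are created before fork(), so the child inherits them.
    shared = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
    private = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE)
    pid = os.fork()
    if pid == 0:
        # Child: both writes succeed, but only the shared one reaches the
        # parent's memory. The private write triggers copy-on-write, so the
        # child ends up modifying its own copy of the page.
        shared[:5] = b"child"
        private[:5] = b"child"
        os._exit(0)
    os.waitpid(pid, 0)
    seen = bytes(shared[:5]), bytes(private[:5])
    shared.close()
    private.close()
    return seen
```

In the parent, the shared mapping shows the child's bytes while the private mapping is still zero-filled, which is the "same chunk unless someone writes privately" behavior described above.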

素染倾城色 2024-07-24 14:17:17

Any data lives in some form of storage: in some cases on an HDD, in embedded systems perhaps in flash memory or even in RAM itself (initramfs). Except in that last case, data in storage is frequently cached in RAM. RAM is logically divided into pages, and the kernel maintains a list of descriptors, each of which uniquely identifies a page.

So, at bottom, accessing data means accessing physical pages. Each process gets its own address space, which consists of many vm_area_struct objects, each identifying a mapped region of the address space. On a call to mmap, a new vm_area_struct may be created, or the region may be merged with an existing one if the addresses are adjacent.

A new virtual address is returned from the call to mmap. Page table entries are also created, holding the mapping from the newly created virtual addresses to the physical addresses where the real data resides. A mapping can be backed by a file, or it can be anonymous, as with malloc. The process address space structure, mm_struct, holds a pointer to the pgd_t (page global directory), through which the physical page is reached and the data accessed.
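
On Linux, the per-process list of mapped regions is exposed through /proc/self/maps, one line per region, so the effect of an mmap call can be observed from user space. The sketch below is Linux-only and assumes /proc is mounted; find_vma and demo are made-up helper names, and matching on the file path is just a convenient way to pick out the region in question.

```python
import mmap
import os
import tempfile


def find_vma(path):
    """Return the /proc/self/maps lines whose backing file is `path` (Linux-only)."""
    with open("/proc/self/maps") as maps:
        return [line.rstrip() for line in maps if line.rstrip().endswith(path)]


def demo():
    fd, path = tempfile.mkstemp(suffix=".mmapdemo")
    path = os.path.realpath(path)  # /proc/self/maps shows the resolved path
    try:
        os.write(fd, b"x" * mmap.PAGESIZE)
        before = find_vma(path)      # nothing mapped yet
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ):
            during = find_vma(path)  # one region now refers to the file
        after = find_vma(path)       # the region is gone after unmapping
        return len(before), len(during), len(after)
    finally:
        os.close(fd)
        os.unlink(path)
```

Each matching line lists the region's address range, permissions, file offset, and backing file, which is roughly the information a mapped region's descriptor carries.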
