How does the Linux kernel manage less than 1 GB of physical memory?
I'm learning Linux kernel internals, and while reading "Understanding the Linux Kernel", quite a few memory-related questions struck me. One of them is how the Linux kernel handles the memory mapping if, say, only 512 MB of physical memory is installed on my system.
As I read, the kernel maps the 0 (or 16) MB-896 MB range of physical RAM to the linear address range starting at 0xC0000000 and can directly address it. So, in the above described case where I only have 512 MB:
How can the kernel map 896 MB from only 512 MB? In the scheme described, the kernel set things up so that every process's page tables map virtual addresses from 0xC0000000 to 0xFFFFFFFF (1 GB) directly to physical addresses from 0x00000000 to 0x3FFFFFFF (1 GB). But when I have only 512 MB of physical RAM, how can virtual addresses from 0xC0000000-0xFFFFFFFF be mapped to physical 0x00000000-0x3FFFFFFF? The point is that I have a physical range of only 0x00000000-0x20000000.
What about user mode processes in this situation?
Every article explains only the situation when you have 4 GB of memory installed and the kernel maps 1 GB into kernel space, while user processes use the remaining amount of RAM.
I would appreciate any help in improving my understanding.
Thanks..!
5 Answers
Not all virtual (linear) addresses must be mapped to anything. If code accesses an unmapped page, a page fault is raised.
A physical page can be mapped to several virtual addresses simultaneously.
In the 4 GB of virtual memory there are 2 sections: 0x0 .. 0xbfffffff is process virtual memory, and 0xc0000000 .. 0xffffffff is kernel virtual memory.
How can the kernel map 896 MB from only 512 MB? It maps up to 896 MB. So, if you have only 512 MB, there will be only 512 MB mapped.
If your physical memory is at 0x00000000 to 0x20000000, it will be mapped for direct kernel access at virtual addresses 0xC0000000 to 0xE0000000 (a linear mapping).
Physical memory for user processes will be mapped (not sequentially, but with an arbitrary page-to-page mapping) to virtual addresses 0x0 .. 0xbfffffff. This mapping is a second mapping for pages in the 0..896 MB range; the pages are taken from the free page lists.
Where is user-process memory located physically? Anywhere.
No. Every article explains how 4 GB of virtual address space is mapped. The size of virtual memory is always 4 GB (for a 32-bit machine without memory extensions such as PAE/PSE/etc. on x86), regardless of how much physical RAM is installed.
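To make that linear mapping concrete, here is a minimal user-space sketch of the arithmetic behind the kernel's lowmem address conversions (what __pa()/__va() compute on 32-bit x86). It is only an illustration, not kernel code; PAGE_OFFSET is the conventional 0xC0000000 and RAM_SIZE is the 512 MB from the question.

    #include <stdio.h>

    #define PAGE_OFFSET 0xC0000000UL  /* start of the kernel's linear map on 32-bit x86 */
    #define RAM_SIZE    0x20000000UL  /* 512 MB of physical RAM, as in the question */

    /* The arithmetic behind the kernel's __va(): physical -> kernel virtual (lowmem only). */
    static unsigned long demo_phys_to_virt(unsigned long phys)
    {
        return phys + PAGE_OFFSET;
    }

    /* The arithmetic behind the kernel's __pa(): kernel virtual -> physical (lowmem only). */
    static unsigned long demo_virt_to_phys(unsigned long virt)
    {
        return virt - PAGE_OFFSET;
    }

    int main(void)
    {
        printf("phys 0x%08lx -> virt 0x%08lx\n",
               0x01000000UL, demo_phys_to_virt(0x01000000UL));
        printf("virt 0x%08lx -> phys 0x%08lx\n",
               0xC1000000UL, demo_virt_to_phys(0xC1000000UL));
        /* With only 512 MB installed, the direct map covers just 0xC0000000..0xDFFFFFFF: */
        printf("direct map: 0x%08lx .. 0x%08lx\n",
               demo_phys_to_virt(0x0UL), demo_phys_to_virt(RAM_SIZE - 1));
        return 0;
    }

Note that this offset arithmetic is valid only for the directly mapped (lowmem) region; that is exactly why physical memory above 896 MB needs separate highmem handling.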
As stated in section 8.1.3, Memory Zones, of the book Linux Kernel Development by Robert Love (I use the third edition), there are several zones of physical memory: on 32-bit x86 these are ZONE_DMA (the first 16 MB), ZONE_NORMAL (16 MB to 896 MB), and ZONE_HIGHMEM (everything above 896 MB). So, if you have 512 MB, your ZONE_HIGHMEM will be empty, and ZONE_NORMAL will have 496 MB of physical memory mapped.
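That zone arithmetic can be written down directly. The helper below is hypothetical, not a kernel API; only the 16 MB and 896 MB boundaries come from the book.

    #include <stdio.h>

    #define MB(x) ((unsigned long)(x) * 1024UL * 1024UL)

    /* Conventional 32-bit x86 zone boundaries. */
    #define ZONE_DMA_END    MB(16)
    #define ZONE_NORMAL_END MB(896)

    static void print_zones(unsigned long ram)
    {
        unsigned long dma    = ram < ZONE_DMA_END ? ram : ZONE_DMA_END;
        unsigned long normal = ram < ZONE_NORMAL_END ? ram : ZONE_NORMAL_END;
        unsigned long high   = ram > ZONE_NORMAL_END ? ram - ZONE_NORMAL_END : 0;

        printf("RAM %4lu MB: ZONE_DMA %2lu MB, ZONE_NORMAL %3lu MB, ZONE_HIGHMEM %4lu MB\n",
               ram / MB(1), dma / MB(1), (normal - dma) / MB(1), high / MB(1));
    }

    int main(void)
    {
        print_zones(MB(512));   /* HIGHMEM empty, NORMAL = 496 MB */
        print_zones(MB(2048));  /* NORMAL capped at 880 MB, the rest is HIGHMEM */
        return 0;
    }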
Also, take a look at section 2.5.5.2, Final kernel Page Table when RAM size is less than 896 MB, of the book Understanding the Linux Kernel. It is about the case when you have less memory than 896 MB. For ARM, there is also a description of the virtual memory layout: http://www.mjmwired.net/kernel/Documentation/arm/memory.txt
Line 63 there, "PAGE_OFFSET high_memory-1", is the direct-mapped part of memory.
-
The hardware provides a Memory Management Unit (MMU). It is a piece of circuitry which is able to intercept and alter any memory access. Whenever the processor accesses RAM, e.g. to read the next instruction to execute, or as a data access triggered by an instruction, it does so at some address which is, roughly speaking, a 32-bit value. A 32-bit word can take a bit more than 4 billion distinct values, so there is an address space of 4 GB: that's the number of bytes which can have a unique address.
So the processor sends out the request to its memory subsystem, as "fetch the byte at address x and give it back to me". The request goes through the MMU, which decides what to do with the request. The MMU virtually splits the 4 GB space into pages; page size depends on the hardware you use, but typical sizes are 4 and 8 kB. The MMU uses tables which tell it what to do with accesses for each page: either the access is granted with a rewritten address (the page entry says: "yes, the page containing address x exists, it is in physical RAM at address y") or rejected, at which point the kernel is invoked to handle things further. The kernel may decide to kill the offending process, or to do some work and alter the MMU tables so that the access may be tried again, this time successfully.
This is the basis for virtual memory: from the process's point of view, it has some RAM, but the kernel may have moved it to the hard disk, into "swap space". The corresponding entry is marked as "absent" in the MMU tables. When the process accesses its data, the MMU invokes the kernel, which fetches the data from swap, puts it back at some free spot in physical RAM, and alters the MMU tables to point at that spot. The kernel then jumps back to the process code, right at the instruction which triggered the whole thing. The process code sees nothing of the whole business, except that the memory access took quite some time.
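As a toy model of that fault-and-retry cycle, here is a user-space simulation; every name in it is invented for illustration, and nothing here is real kernel code.

    #include <stdio.h>
    #include <stdbool.h>

    #define NPAGES 4

    /* Toy page-table entry: one frame number plus a "present" bit. */
    struct pte {
        unsigned long frame;  /* physical frame, meaningful only if present */
        bool present;         /* false: the data is out on "swap" */
    };

    static struct pte table[NPAGES] = {
        { 7, true }, { 0, false }, { 3, true }, { 0, false },
    };

    /* Pretend the kernel reads the page back from swap into a free frame. */
    static unsigned long swap_in(unsigned long vpage)
    {
        printf("  page fault on vpage %lu: fetching from swap\n", vpage);
        return 10 + vpage;  /* toy choice of a free physical frame */
    }

    /* One memory access, as the MMU plus fault handler would see it. */
    static unsigned long translate(unsigned long vpage)
    {
        if (!table[vpage].present) {              /* MMU raises a page fault...   */
            table[vpage].frame = swap_in(vpage);  /* ...kernel repairs the table, */
            table[vpage].present = true;          /* ...and the access is retried */
        }
        return table[vpage].frame;
    }

    int main(void)
    {
        for (unsigned long v = 0; v < NPAGES; v++)
            printf("vpage %lu -> frame %lu\n", v, translate(v));
        return 0;
    }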
The MMU also handles access rights, which prevents a process from reading or writing data which belongs to other processes, or to the kernel. Each process has its own set of MMU tables, and the kernel manages those tables. Thus, each process has its own address space, as if it were alone on a machine with 4 GB of RAM -- except that the process had better not access memory that it did not rightfully allocate from the kernel, because the corresponding pages are marked as absent or forbidden.
When the kernel is invoked through a system call from some process, the kernel code must run within the address space of that process; so the kernel code must be somewhere in the address space of each process (but protected: the MMU tables prevent access to kernel memory from unprivileged user code). Since code can contain hardcoded addresses, the kernel had better be at the same address in all processes; conventionally, in Linux, that address is 0xC0000000. The MMU tables for each process map that part of the address space to whatever physical RAM blocks the kernel was actually loaded into upon boot. Note that kernel memory is never swapped out (if the code which reads data back from swap space were itself swapped out, things would turn sour quite fast).
On a PC, things can be a bit more complicated, because there are 32-bit and 64-bit modes, and segment registers, and PAE (which acts as a kind of second-level MMU with huge pages). The basic concept remains the same: each process gets its own view of a virtual 4 GB address space, and the kernel uses the MMU to map each virtual page to an appropriate physical position in RAM, or nowhere at all.
-
osgx has an excellent answer, but I see a comment where someone still doesn't understand.
Here is where much of the confusion lies. There is virtual memory and there is physical memory. Every 32-bit CPU has 4 GB of virtual address space. The Linux kernel's traditional split was 3G/1G for user memory and kernel memory, but newer options allow different partitioning.
When a task switch occurs, the MMU must be updated. The kernel's MMU space should remain the same for all processes, since the kernel must be able to handle interrupts and fault requests at any time.
There are many permutations of virtual memory: a virtual page can be unmapped, mapped to a physical page, allocated but currently swapped out to disk, or aliased so that several virtual addresses refer to the same physical page.
From that list, it is easy to see why you may have more virtual address space than physical memory. In fact, the fault handler will typically inspect process memory information to see if a page is mapped (I mean allocated for the process) but not in memory. In that case the fault handler will call the I/O sub-system to read in the page. When the page has been read and the MMU tables updated to point the virtual address at the new physical address, the process that caused the fault resumes.
If you understand the above, it becomes clear why you would like to have a larger virtual mapping than physical memory. It is how memory swapping is supported.
There are other uses. For instance, two processes may use the same code library. Due to linking, the library may be at different virtual addresses in each process's address space. In that case you can map the different virtual addresses to the same physical page, to save physical memory. This is also quite common for new allocations: they all point at the physical 'zero page'. When you touch/write the memory, the zero page is copied and a new physical page is allocated (COW, or copy-on-write).
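Copy-on-write can be observed from user space. In this short Linux sketch (error handling trimmed), parent and child share the same physical page after fork() until one of them writes, at which point the writer gets its own copy:

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        /* Private anonymous mapping: initially backed by the shared zero page. */
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;

        strcpy(p, "parent data");
        if (fork() == 0) {
            /* The child's write faults; the kernel copies the page for the child. */
            strcpy(p, "child data");
            printf("child  sees: %s\n", p);
            _exit(0);
        }
        wait(NULL);
        printf("parent sees: %s\n", p);  /* still "parent data" */
        return 0;
    }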
It is also sometimes useful to have virtual pages aliased, one cached and one non-cached. The two pages can be examined to see what data is cached and what is not.
The main point is that virtual and physical are not the same! Easily stated, but often confusing when looking at the Linux VMM code.
-
Hi. Actually, I don't work on an x86 hardware platform, so there may be some technical errors in my post.
To my knowledge, the 0 (or 16) MB - 896 MB range is singled out when you have more RAM than that; say you have 1 GB of physical RAM on your board. The RAM below 896 MB is called "low memory", and any physical RAM above 896 MB is called highmem.
Speaking of your question, there is 512 MiB of physical RAM on your board, so actually there is no 896 MB boundary and no highmem.
The total RAM the kernel can see and also map is 512 MB.
Because there is a one-to-one mapping between (low) physical memory and kernel virtual addresses, there is 512 MiB of virtual address space for the kernel. I'm really not sure whether the prior sentence is right, but it's what's in my mind.
What I mean is: if there is 512 MiB of RAM, then the amount of physical RAM the kernel can manage is also 512 MiB; furthermore, the kernel cannot create an address space bigger than 512 MiB out of it.
As for user space, there is one difference: pages of a user application can be swapped out to the hard disk, but pages of the kernel cannot.
So, for user space, with the help of page tables and other related machinery, it appears that there is still a 4 GB address space.
Of course, this is virtual address space, not physical RAM space.
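You can see that difference for yourself: a process can reserve far more virtual address space than the machine's RAM. A hedged Linux sketch follows; it reserves address space only, and whether a 2 GB reservation succeeds depends on your user-space layout.

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 2UL * 1024 * 1024 * 1024;  /* 2 GB of address space */

        /* PROT_NONE + MAP_NORESERVE: reserve address space, commit no RAM. */
        void *p = mmap(NULL, len, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("reserved 2 GB of virtual address space at %p\n", p);
        munmap(p, len);
        return 0;
    }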
This is what I understand.
Thanks.
-
If the physical memory is less than 896 MB, then the Linux kernel maps linearly up to that physical address.
For details see http://learnlinuxconcepts.blogspot.in/2014/02/linux-addressing.html