What is paging?
Paging is explained here, slide #6:
http://www.cs.ucc.ie/~grigoras/CS2506/Lecture_6.pdf
in my lecture notes, but I cannot for the life of me understand it. I know it's a way of translating virtual addresses to physical addresses. So the virtual addresses, which are on disk, are divided into chunks of 2^k. I am really confused after this. Can someone please explain it to me in simple terms?
3 Answers
Paging is, as you've noted, a type of virtual memory. To answer the question raised by @John Curtsy: it's covered separately from virtual memory in general because there are other types of virtual memory, although paging is now (by far) the most common.
Paged virtual memory is pretty simple: you split all of your physical memory up into blocks, mostly of equal size (though having a selection of two or three sizes is fairly common in practice). Making the blocks equal sized makes them interchangeable.
Then you have addressing. You start by breaking each address up into two pieces. One is an offset within a page. You normally use the least significant bits for that part. If you use (say) 4K pages, you need 12 bits for the offset. With (say) a 32-bit address space, that leaves 20 more bits.
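As a rough illustration of that split (assuming the same 4K pages and 32-bit addresses used in the example above), the page number and offset can be pulled apart with a shift and a mask:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                        /* 4K pages -> 12 offset bits */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)  /* low 12 bits select a byte  */

int main(void)
{
    uint32_t linear = 0x12345678;             /* an arbitrary example address */
    uint32_t page   = linear >> PAGE_SHIFT;   /* upper 20 bits: page number   */
    uint32_t offset = linear &  PAGE_MASK;    /* lower 12 bits: byte in page  */

    printf("page 0x%05x, offset 0x%03x\n", (unsigned)page, (unsigned)offset);
    return 0;
}
```

For 0x12345678 this prints page 0x12345, offset 0x678.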
From there, things are really a lot simpler than they initially seem. You basically build a small "descriptor" to describe each page of memory. This will have a linear address (the address used by the client application to address that memory), and a physical address for the memory, as well as a Present bit. There will (at least usually) be a few other things like permissions to indicate whether data in that page can be read, written, executed, etc.
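A minimal sketch of such a descriptor might look like the struct below. The field names and layout are my own for illustration; real hardware (x86, ARM, etc.) packs the same information into a single word with its own bit layout.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical page descriptor, one per page of the linear address space.
 * The linear address itself is implicit: it's the index of the descriptor
 * in its page table. */
struct page_descriptor {
    uint32_t phys_frame;   /* physical frame number (upper bits of the
                              physical address); meaningful only if present */
    bool     present;      /* is the page currently in physical memory?     */
    bool     writable;     /* may the page be written?                      */
    bool     executable;   /* may code in the page be executed?             */
    bool     user;         /* accessible from user mode?                    */
};
```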
Then, when client code uses an address, the CPU starts by splitting the page offset off from the rest of the address. It then takes the rest of the linear address and looks through the page descriptors to find the physical address that goes with that linear address. Then, to address the physical memory, it combines the upper 20 bits of the physical address (from the descriptor) with the lower 12 bits of the linear address; together they form the actual physical address that goes out on the processor pins and gets data from the memory chip.
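Putting the two previous sketches together, a software model of that lookup might look like the following. It assumes a single flat table indexed by the upper 20 bits, which is a simplification; real hardware walks a multi-level tree of smaller tables.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

struct page_descriptor {   /* same hypothetical layout as in the sketch above */
    uint32_t phys_frame;
    bool     present;
};

/* One descriptor per 4K page of a 32-bit linear address space (2^20 of
 * them).  A real MMU walks a multi-level tree instead of one flat array. */
extern struct page_descriptor page_table[1u << 20];

/* Model of the MMU lookup: the upper 20 bits pick a descriptor, the
 * descriptor supplies the physical frame, and the low 12 bits of the
 * linear address carry over unchanged. */
uint32_t translate(uint32_t linear, bool *fault)
{
    uint32_t page   = linear >> PAGE_SHIFT;
    uint32_t offset = linear &  PAGE_MASK;

    if (!page_table[page].present) {
        *fault = true;          /* real hardware would raise a page fault here */
        return 0;
    }
    *fault = false;
    return (page_table[page].phys_frame << PAGE_SHIFT) | offset;
}
```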
Now, we get to the part where we get "true" virtual memory. When programs are using more memory than is actually available, the OS takes the data for some of those pages, writes it out to the disk drive, and then clears the "Present" bit in each of those pages' descriptors. Those physical pages of memory are now free for some other purpose.
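In pseudo-C, that page-out step might be sketched as below; disk_write and swap_slot_of are hypothetical placeholders, not any real kernel's API.

```c
#include <stdint.h>
#include <stdbool.h>

struct page_descriptor { uint32_t phys_frame; bool present; };

/* Hypothetical helpers -- placeholders, not any real kernel's API. */
extern uint32_t swap_slot_of(uint32_t page);               /* page's slot on disk    */
extern void     disk_write(uint32_t slot, uint32_t frame); /* frame contents -> disk */

/* Evict one page: save its contents to disk and clear Present, so its
 * physical frame can be reused for something else. */
uint32_t evict_page(struct page_descriptor *pt, uint32_t victim_page)
{
    uint32_t frame = pt[victim_page].phys_frame;
    disk_write(swap_slot_of(victim_page), frame);  /* write the data out       */
    pt[victim_page].present = false;               /* page is no longer mapped */
    return frame;                                  /* this frame is now free   */
}
```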
When the client program tries to refer to that memory, the CPU checks whether the Present bit is set. If it's not, the CPU raises an exception (a page fault). When that happens, the OS frees up a physical page as above, reads the data for the needed page back in from disk, and fills in the page descriptor with the address of the physical page where it's now located. When it's done all that, it returns from the exception, and the CPU restarts execution of the instruction that caused the exception -- except now the Present bit is set, so using the memory will work.
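The fault-handling path can be sketched the same way; again, pick_victim_frame, swap_slot_of, and disk_read are hypothetical placeholders standing in for whatever a real kernel would do.

```c
#include <stdint.h>
#include <stdbool.h>

struct page_descriptor { uint32_t phys_frame; bool present; };

/* Hypothetical helpers -- placeholders, not any real kernel's API. */
extern uint32_t pick_victim_frame(struct page_descriptor *pt); /* evict something, return its frame */
extern uint32_t swap_slot_of(uint32_t page);                   /* page's slot on disk               */
extern void     disk_read(uint32_t slot, uint32_t frame);      /* disk -> frame contents            */

/* Rough outline of the OS's page-fault handler for the case described
 * above: the page's data is on disk and its Present bit is clear. */
void handle_page_fault(struct page_descriptor *pt, uint32_t faulting_page)
{
    uint32_t frame = pick_victim_frame(pt);          /* free up a physical frame     */
    disk_read(swap_slot_of(faulting_page), frame);   /* pull the page back from disk */
    pt[faulting_page].phys_frame = frame;            /* point the descriptor at it   */
    pt[faulting_page].present    = true;             /* mark it usable again         */
    /* Returning from the exception re-runs the faulting instruction,
     * which now finds the Present bit set and succeeds. */
}
```

The key point is the last step: once the descriptor is marked present again, re-running the faulting instruction behaves as if the page had been in memory all along.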
There is one more detail that you probably need to know: the page descriptors are normally arranged into page tables, and (the important part) you normally have a separate set of page tables for each process in the system (and another for the OS kernel itself). Having separate page tables for each process means that each process can use the same set of linear addresses, but those get mapped to a different set of physical addresses as needed. You can also map the same physical memory into more than one process by creating two separate page descriptors (one for each process) that contain the same physical address. Most OSes use this so that, for example, if you have two or three copies of the same program running, there will really only be one copy of that program's executable code in memory -- but two or three sets of page descriptors will point to that same code, so all of the copies can use it without duplicating it for each one.
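To make the sharing point concrete, here is a toy example (using the same hypothetical descriptor layout as above) in which two processes' page tables map the same physical frame for one page and different frames for another:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

struct page_descriptor { uint32_t phys_frame; bool present; };

#define NPAGES 16   /* a toy address space: 16 pages per process */

int main(void)
{
    /* One page table per process. */
    struct page_descriptor proc_a[NPAGES] = {{0}};
    struct page_descriptor proc_b[NPAGES] = {{0}};

    /* Both processes map linear page 3 to the same physical frame 42 --
     * e.g. the shared executable code of a program that is running twice. */
    proc_a[3] = (struct page_descriptor){ .phys_frame = 42, .present = true };
    proc_b[3] = (struct page_descriptor){ .phys_frame = 42, .present = true };

    /* Linear page 4 maps to different frames, so each process sees its
     * own private data at that address. */
    proc_a[4] = (struct page_descriptor){ .phys_frame = 7, .present = true };
    proc_b[4] = (struct page_descriptor){ .phys_frame = 9, .present = true };

    printf("page 3: A->frame %u, B->frame %u (shared)\n",
           (unsigned)proc_a[3].phys_frame, (unsigned)proc_b[3].phys_frame);
    printf("page 4: A->frame %u, B->frame %u (private)\n",
           (unsigned)proc_a[4].phys_frame, (unsigned)proc_b[4].phys_frame);
    return 0;
}
```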
Of course, I'm simplifying a lot -- quite a few complete (and often fairly large) books have been written about virtual memory. There's also a fair amount of variation among machines, with various embellishments added, minor changes in parameters made (e.g., whether a page is 4K or 8K), and so on. Nonetheless, this is at least a general idea of the core of what happens (and it's still at a high enough level to apply about equally to an ARM, x86, MIPS, SPARC, etc.)
Bibliography
[Note: I haven't read all of these in their entirety, so I can't really vouch for their necessarily being great books.]
General References
OS References
CPU References
[Note: these links are likely to go stale. Sorry, but not much help for that.]
Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide, Part 1
ARM Architecture Reference Manual, Virtual Memory System Architecture: https://developer.arm.com/documentation/ddi0406/cb/System-Level-Architecture/Virtual-Memory-System-Architecture--VMSA-
RISC V Privileged Specification
Simply put, it's a way of holding far more data than your address space would normally allow. That is, if you have a 32-bit address space and a 4-bit virtual address, you can hold (2^32)^(2^4) addresses (far more than a 32-bit address space).
Paging is a storage mechanism that allows the OS to retrieve processes from secondary storage into main memory in the form of pages. In the paging method, main memory is divided into small fixed-size blocks of physical memory, which are called frames. The size of a frame is kept the same as the size of a page, to make maximum use of main memory and to avoid external fragmentation.
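For a quick sense of the numbers (taking, say, 4 KB pages and 64 MB of physical memory, which are just example figures, not anything from the answer above):

```c
#include <stdio.h>

int main(void)
{
    unsigned long frame_size = 4UL * 1024;          /* 4 KB frames = page size */
    unsigned long phys_mem   = 64UL * 1024 * 1024;  /* 64 MB of physical RAM   */

    /* Because pages and frames are the same size, any page fits in any
     * frame, and the frame count is just memory size / frame size. */
    printf("%lu frames of %lu bytes each\n", phys_mem / frame_size, frame_size);
    return 0;
}
```

This prints 16384 frames of 4096 bytes each.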