How does remap_pfn_range remap kernel memory to user space?
The remap_pfn_range function (used in the mmap call in a driver) can be used to map kernel memory to user space. How is it done? Can anyone explain the precise steps?

Kernel mode is a privileged mode (PM), while user space is non-privileged (NPM). In PM the CPU can access all memory, while in NPM some memory is restricted and cannot be accessed by the CPU. When remap_pfn_range is called, how does a range of memory that was previously restricted to PM become accessible to user space?

Looking at the remap_pfn_range code, there is a pgprot_t struct, which is related to protection mappings. What is a protection mapping? Is it the answer to the above question?
It's simple, really: kernel memory (usually) simply has a page table entry with an architecture-specific bit that says "this page table entry is only valid while the CPU is in kernel mode".

What remap_pfn_range does is create another page table entry, with a different virtual address to the same physical memory page, that doesn't have that bit set.

Usually it's a bad idea, btw :-)
The core of the mechanism is the MMU's page table:

[Figure: x86 page-table address translation] http://windowsitpro.com/content/content/3686/figure_01.gif

The diagrams above depict the x86 hardware memory MMU; they have nothing to do with the Linux kernel specifically.

Below is how the VMAs are linked to the process's task_struct:

[Figure: VMAs linked from task_struct] http://image9.360doc.com/DownloadImg/2010/05/0320/3083800_2.gif (source: slideplayer.com)

And here is the function itself:

http://lxr.free-electrons.com/source/mm/memory.c#L1756

The data in physical memory can be accessed by the kernel through the kernel's PTEs. [Figure omitted] (source: tldp.org)

But after calling remap_pfn_range(), a PTE is derived for the existing kernel memory, with different page-protection flags, to be used from userspace to access it. The process's VMA is updated to use this PTE to access the same memory, thus minimizing the need to waste memory by copying. But kernel and userspace PTEs have different attributes, which are used to control access to the physical memory, and the VMA also specifies these attributes at the process level.
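To make the flow concrete, here is a minimal sketch of the kernel side: a character driver's mmap handler that hands an existing one-page kernel buffer to remap_pfn_range(). All `demo_*` names are invented for illustration, and the buffer allocation is assumed to happen in module init; this is not code from the answer above or from any real driver.

```c
#include <linux/fs.h>
#include <linux/io.h>
#include <linux/mm.h>
#include <linux/module.h>

/* One-page kernel buffer to expose; assume it was allocated in module
 * init (e.g. kmalloc(PAGE_SIZE, GFP_KERNEL)) and is page-aligned.
 * Real drivers must also mark such pages reserved or use a proper
 * DMA/allocation API intended for mapping to user space. */
static void *demo_buf;

/* Called when userspace mmap()s our character device node. */
static int demo_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long size = vma->vm_end - vma->vm_start;
    unsigned long pfn  = virt_to_phys(demo_buf) >> PAGE_SHIFT;

    if (size > PAGE_SIZE)
        return -EINVAL;

    /* Create user-visible PTEs for the same physical page the kernel
     * already maps. vma->vm_page_prot carries the pgprot_t, i.e. the
     * protection bits (including the user/supervisor bit) that the
     * new mapping will have. */
    return remap_pfn_range(vma, vma->vm_start, pfn, size,
                           vma->vm_page_prot);
}

static const struct file_operations demo_fops = {
    .owner = THIS_MODULE,
    .mmap  = demo_mmap,
};
```

Nothing is copied: the handler only installs new page table entries pointing at the frame the kernel buffer already occupies, with the VMA's protection attributes.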
The internal bookkeeping with PTE objects explained by Peter Teoh (2012) is just how it is, so that the Linux kernel runs on various current hardware. But the OP asked specifically about (1) character device drivers and (2) memory protection and pgprot_t objects.

When a user-land process accesses memory it does not own, the hardware creates a page fault. The kernel catches it and, in do_page_fault, either makes the page available to the process or kills the process to prevent damage, because there clearly is a programming error.

We can prevent that by remapping kernel virtual addresses (e.g. converted from known physical addresses, static memory, or memory allocated by kmalloc, get_free_page and friends) to the user-space process. But when and where exactly? The OP mentioned a character driver that implements mmap, which is a function pointer in the file_operations object that the driver fills in when it loads. When a user-land process calls mmap from the standard C library, passing it the file path of the device driver as an argument, the kernel calls the registered mmap function. It is within that function that remap_pfn_range is called.

And the OP's question was how exactly this works: "Can anyone explain precise steps?" Well, this is where words start to fail and we have to turn to source code. Here is the mmap implementation of /dev/mem, by Torvalds himself. As you can see, it's basically a wrapper around remap_pfn_range. Unfortunately, if you look into the code you don't see any magic happening, just more bookkeeping: it reserves the pages for the user process and modifies the pgprot_t value.

This is probably how it works: the first access to the mapped memory from the process initially generates a page fault, but now do_page_fault knows which process the page belongs to.

To return the pages to the kernel, the process calls munmap.

This is a complex topic, so it would be good if someone would confirm/criticize/expand on my statements.