How does remap_pfn_range remap kernel memory to user space?

Posted 2024-12-26 00:11:04


The remap_pfn_range function (used in the mmap call in a driver) can be used to map kernel memory to user space. How is it done? Can anyone explain the precise steps? Kernel mode is a privileged mode (PM) while user space is non-privileged (NPM). In PM the CPU can access all memory, while in NPM some memory is restricted and cannot be accessed by the CPU. When remap_pfn_range is called, how is that range of memory, which was previously restricted to PM, now accessible to user space?

Looking at the remap_pfn_range code, there is a pgprot_t struct. This is a struct related to protection mappings. What is a protection mapping? Is it the answer to the above question?


Comments (3)

八巷 2025-01-02 00:11:04


It's simple really: kernel memory (usually) just has a page table entry with the architecture-specific bit that says: "this page table entry is only valid while the CPU is in kernel mode".

What remap_pfn_range does is create another page table entry, at a different virtual address, for the same physical memory page, and that entry does not have that bit set.

Usually, it's a bad idea btw :-)
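To make the "architecture-specific bit" concrete: on x86 it is the user/supervisor bit (bit 2) of each page table entry. Below is a minimal user-space illustration, not kernel code; the PTE_* and *_DEMO names are invented for this demo, while the real definitions live in arch/x86/include/asm/pgtable_types.h.

/* Illustration only: an x86 PTE has a user/supervisor bit. A kernel-only
 * protection value leaves it clear, so any user-mode access through such an
 * entry faults; a user mapping of the same physical page sets it.
 * All names below are made up for this demo. */
#include <stdio.h>

#define PTE_PRESENT (1UL << 0)  /* entry is valid                        */
#define PTE_RW      (1UL << 1)  /* writable                              */
#define PTE_USER    (1UL << 2)  /* usable while the CPU is in user mode  */

#define PROT_KERNEL_DEMO (PTE_PRESENT | PTE_RW)            /* kernel-only page  */
#define PROT_USER_DEMO   (PTE_PRESENT | PTE_RW | PTE_USER) /* user-visible page */

int main(void)
{
    printf("kernel-only prot 0x%lx: user bit %s\n", PROT_KERNEL_DEMO,
           (PROT_KERNEL_DEMO & PTE_USER) ? "set" : "clear");
    printf("user prot        0x%lx: user bit %s\n", PROT_USER_DEMO,
           (PROT_USER_DEMO & PTE_USER) ? "set" : "clear");
    return 0;
}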

丘比特射中我 2025-01-02 00:11:04


The core of the mechanism is the MMU and its page tables:

[Figure: MMU page-table translation (http://windowsitpro.com/content/content/3686/figure_01.gif)]

[Figure: a second MMU/page-table diagram, image not preserved]

Both pictures above show the x86 hardware memory MMU; they have nothing to do with the Linux kernel as such.

Below is shown how the VMAs are linked to the process's task_struct:

[Figure: VMAs linked to the process's task_struct (http://image9.360doc.com/DownloadImg/2010/05/0320/3083800_2.gif)]

[Figure: VMA/task_struct diagram (source: slideplayer.com)]

And looking into the function itself here:

http://lxr.free-electrons.com/source/mm/memory.c#L1756

The data in physical memory can be accessed by the kernel through the kernel's PTE, as shown below:

[Figure: physical memory reached through the kernel's page table entries (source: tldp.org)]

But after calling remap_pfn_range(), a new PTE is derived for the existing kernel memory, with different page protection flags, so that user space can access it. The process's VMA is updated to use this PTE to reach the same physical memory, which minimizes the need to waste memory on copying. The kernel and userspace PTEs have different attributes, which control access to the physical memory, and the VMA also specifies these attributes at the process level:

vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
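For context, here is a minimal sketch of the driver side that line usually belongs to: an mmap file operation that sets the VMA flags and then calls remap_pfn_range() to install the user-space PTEs for a physical range the driver already owns. The my_drv_* names and my_buf_phys are hypothetical, and note that recent kernels require vm_flags_set() instead of writing vm_flags directly.

/* Sketch of a character-driver mmap handler (hypothetical names). Assumes
 * my_buf_phys holds the physical address of a page-aligned driver buffer
 * at least as large as the requested mapping. */
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/module.h>

static phys_addr_t my_buf_phys;		/* physical address of the buffer */

static int my_drv_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	unsigned long pfn  = my_buf_phys >> PAGE_SHIFT;	/* page frame number */

	/* Mark this as a raw PFN mapping with no struct page backing. */
	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;

	/* Build the user-space PTEs for the same physical pages, using the
	 * protection bits (pgprot_t) the core mm prepared in vm_page_prot. */
	if (remap_pfn_range(vma, vma->vm_start, pfn, size, vma->vm_page_prot))
		return -EAGAIN;

	return 0;
}

/* The handler is exposed to user space through file_operations: when a
 * process mmap()s this device's file, the kernel calls my_drv_mmap(). */
static const struct file_operations my_drv_fops = {
	.owner = THIS_MODULE,
	.mmap  = my_drv_mmap,
};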

桃酥萝莉 2025-01-02 00:11:04


The internal bookkeeping with PTE objects explained by Peter Teoh 2012 is just how it is, so that the Linux kernel can run on various current hardware. But the OP (original poster) asked specifically about (1) character device drivers and (2) memory protection and pgprot_t objects.

When a user-land process accesses memory it does not own, the hardware creates a page fault. The kernel catches it and, in do_page_fault, either makes the page available to the process or kills it to prevent damage, because there clearly is a programming error.

We can prevent that by remapping kernel virtual addresses (e.g. converted from known physical addresses, static memory, or memory allocated by kmalloc, get_free_page and friends) to the user-space process. But when and where exactly? The OP mentioned a character driver that implements mmap, which is a function pointer in the file_operations object that the driver fills in when it loads. When a user-land process calls mmap from the standard C library, passing it the file path of the device driver as an argument, the kernel calls the registered mmap function.
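To illustrate that call path from the user side, here is a small sketch; the device node /dev/mydrv and the one-page length are made up, and the real path and size depend on the driver.

/* User-space sketch: map the driver's memory into this process. The mmap()
 * call below ends up in the driver's registered mmap file operation, which
 * is where remap_pfn_range() runs. Device path and length are hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/mydrv", O_RDWR);
	if (fd < 0) {
		perror("open");
		return EXIT_FAILURE;
	}

	size_t len = 4096;	/* one page, for the example */
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return EXIT_FAILURE;
	}

	p[0] = 0x42;		/* touches the driver-owned physical page */

	munmap(p, len);		/* hand the mapping back to the kernel */
	close(fd);
	return EXIT_SUCCESS;
}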

It is within that function that remap_pfn_range is called. And the OP's question was how exactly this works: "Can anyone explain precise steps?"

Well, this is where words start to fail and we have to turn to source code. Here is the mmap implementation of /dev/mem (mmap_mem() in drivers/char/mem.c), by Torvalds himself. As you can see, it's basically a wrapper around remap_pfn_range. Unfortunately, if you look into the code you don't see any magic happening, just more bookkeeping. It reserves the page for the user process and modifies the pgprot_t value.
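As a condensed paraphrase of what that handler boils down to (not the literal code: the real version first validates the physical range and access rights and adjusts the pgprot_t through arch-specific helpers, and details differ between kernel versions):

/* Condensed paraphrase of the /dev/mem mmap handler, not the literal code.
 * The real mmap_mem() in drivers/char/mem.c performs range and permission
 * checks and tweaks vma->vm_page_prot before this essential step. */
#include <linux/fs.h>
#include <linux/mm.h>

static int dev_mem_mmap_sketch(struct file *file, struct vm_area_struct *vma)
{
	size_t size = vma->vm_end - vma->vm_start;

	/* For /dev/mem the mmap() file offset is a physical address, so
	 * vm_pgoff already holds the starting page frame number. */
	return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, size,
			       vma->vm_page_prot) ? -EAGAIN : 0;
}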

This is probably how it works: the first access to the mapped memory from the process initially generates a page fault, but now do_page_fault knows which process the page belongs to.

To return the page to the kernel, the process calls munmap.

This is a complex topic and so it would be good if someone would confirm/criticize/expand on my statements.
