Interaction between fork and user-space memory mapped in the kernel

Posted on 2024-09-29 10:03:20

Consider a Linux driver that uses get_user_pages (or get_page) to map pages from the calling process. The physical addresses of the pages are then passed to a hardware device. Both the process and the device may read and write to the pages until the parties decide to end the communication. In particular, the communication may continue using the pages after the system call that calls get_user_pages returns. The system call is in effect setting up a shared memory zone between the process and the hardware device.

I'm concerned about what happens if the process calls fork (it could be from another thread, and could happen either while the syscall that calls get_user_pages is in progress or later). In particular, if the parent writes to the shared memory area after the fork, what do I know about the underlying physical address (presumably changed due to copy-on-write)? I want to understand:

1. what the kernel needs to do to defend against a potentially misbehaving process (I don't want to create a security hole!);
2. what restrictions the process needs to obey so that the functionality of our driver works correctly (i.e. the physical memory remains mapped at the same address in the parent process).
   - Ideally, I would like the common case where the child process doesn't use our driver at all (it probably calls exec almost immediately) to work.
   - Ideally, the parent process should not have to take any special steps when allocating the memory, as we have existing code that passes a stack-allocated buffer to the driver.
   - I'm aware of madvise with MADV_DONTFORK, and it would be OK to have the memory disappear from the child process's space, but it's not applicable to a stack-allocated buffer.
   - “Don't use fork while you have a connection active with our driver” would be annoying, but acceptable as a last resort if point 1 is satisfied.
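
For reference, the madvise approach mentioned in the list above might look roughly like the following. This is only a sketch under stated assumptions: the buffer comes from an anonymous mmap (so it is page-aligned, unlike a stack buffer), and the helper name alloc_dontfork_buffer is hypothetical, not part of any existing API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map at least `len` bytes (rounded up to whole pages)
 * and mark them MADV_DONTFORK, so the range is not mapped in any child
 * created by a later fork(). Returns NULL on failure. */
void *alloc_dontfork_buffer(size_t len, size_t *mapped_len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t mask = (size_t)page_size - 1;
    size_t rounded = (len + mask) & ~mask;

    void *buf = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* Exclude the whole range from the address space of future children. */
    if (madvise(buf, rounded, MADV_DONTFORK) != 0) {
        munmap(buf, rounded);
        return NULL;
    }
    *mapped_len = rounded;
    return buf;
}
```

The caller would pass the returned pointer to the driver and release it with munmap(buf, mapped_len) once the connection is closed. It does not help with the existing stack-allocated buffers.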

I'm willing to be pointed to documentation or source code. I've looked in particular at Linux Device Drivers, but didn't find this issue addressed. RTFS applied to even just the relevant part of the kernel source is a bit overwhelming.

The kernel version is not completely fixed but is a recent one (let's say ≥2.6.26). We're only targeting Arm platforms (single-processor so far, but multicore is just around the corner), if it matters.

痴梦一场 2024-10-06 10:03:20

A fork() will not interfere with get_user_pages(): get_user_pages() will give you a struct page.

You would need to kmap() it before being able to access it, and this mapping is done in kernel space, not userspace.

EDIT: get_user_pages() touches the page table, but you should not be worried about this (it just makes sure that the pages are mapped in userspace), and it returns -EFAULT if it had any problem doing so.

If you fork(), until copy-on-write is performed, the child will be able to see that page.
Once copy-on-write is done (because the child/the driver/the parent wrote to the page through the userspace mapping -- not the kernel kmap() the driver has), that page will no longer be shared. If you still hold a kmap() on the page (in the driver code), you will not be able to know whether you are holding the parent's page or the child's.
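
The userspace side of this can be observed with a small experiment. The sketch below is an illustration only, not driver code: it assumes a private anonymous mapping stands in for the buffer, and the function name child_view_after_parent_write is made up for the example. The parent writes to the page after fork(), and the child reports which value it still sees.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the byte the child reads from a private page after the parent
 * has overwritten it post-fork, or -1 on error. */
int child_view_after_parent_write(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    int sync_pipe[2];
    unsigned char *page = mmap(NULL, (size_t)page_size,
                               PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED || pipe(sync_pipe) != 0)
        return -1;
    page[0] = 1;                      /* value present at fork time */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char token;
        close(sync_pipe[1]);
        /* Block until the parent has finished its write. */
        if (read(sync_pipe[0], &token, 1) != 1)
            _exit(255);
        _exit(page[0]);               /* report what the child sees */
    }

    close(sync_pipe[0]);
    page[0] = 2;                      /* write fault: the page stops being shared */
    if (write(sync_pipe[1], "x", 1) != 1)
        return -1;
    close(sync_pipe[1]);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    munmap(page, (size_t)page_size);
    return WEXITSTATUS(status);
}
```

The child keeps seeing the value from fork time, which shows that the two processes no longer share the page after the write. The experiment does not tell you which of the two physical pages a driver that pinned the page before the fork would still be holding.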

1) It's not a security hole, because once you execve(), all of that is gone.

2) When you fork() you want both processes to be identical (it's a fork!). I would think that your design should allow both the parent and the child to access the driver. execve() will flush everything.

What about adding some functionality in userspace like:

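 /* length: size of the shared buffer, chosen by the caller */
 int fd = open("/dev/your_thing", O_RDWR);
 void *mapping = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);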

When mmap() is called on your device, you install a memory mapping, with special flags:
http://os1a.cs.columbia.edu/lxr/source/include/linux/mm.h#071

You have some interesting things like:

#define VM_SHARED       0x00000008
#define VM_LOCKED       0x00002000
#define VM_DONTCOPY     0x00020000      /* Do not copy this vma on fork */

VM_SHARED will disable copy-on-write for the mapping.
VM_LOCKED will disable swapping of the pages in that mapping.
VM_DONTCOPY will tell the kernel not to copy the vma region on fork, although I don't think it's a good idea.
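
As a rough userspace analogue of the VM_SHARED behaviour, the sketch below uses MAP_SHARED | MAP_ANONYMOUS in place of a real device mapping (an assumption made so the example runs without any driver; the function name child_view_of_shared_page is hypothetical). With a shared mapping, a write made by the parent after fork() is visible to the child, because no copy-on-write takes place.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the byte the child reads from a shared page after the parent
 * has overwritten it post-fork, or -1 on error. */
int child_view_of_shared_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    int sync_pipe[2];
    unsigned char *page = mmap(NULL, (size_t)page_size,
                               PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED || pipe(sync_pipe) != 0)
        return -1;
    page[0] = 1;                      /* value present at fork time */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char token;
        close(sync_pipe[1]);
        /* Block until the parent has finished its write. */
        if (read(sync_pipe[0], &token, 1) != 1)
            _exit(255);
        _exit(page[0]);               /* report what the child sees */
    }

    close(sync_pipe[0]);
    page[0] = 2;                      /* same physical page in both processes */
    if (write(sync_pipe[1], "x", 1) != 1)
        return -1;
    close(sync_pipe[1]);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    munmap(page, (size_t)page_size);
    return WEXITSTATUS(status);
}
```

Here the child sees the parent's new value, so the physical page behind the mapping stays the same across the fork. A driver-provided mmap() would presumably give the same guarantee for its own buffer, at the cost of the process no longer being able to pass an arbitrary stack buffer.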
