According to the vfork() man page, the behaviour is undefined if the child created by vfork() modifies any data other than a pid_t variable before it calls either _exit or one of the exec family of syscalls.
By this I understand that if the child process created by vfork() calls exec(), then it can modify any data, and the behaviour is still not undefined.
My questions are:
- It is also known that the child shares the parent's address space, so how come the behaviour is not undefined if the child overwrites its own and the parent's image using exec?
- What happens to the parent if the child calls exec and after that it returns? Does the parent start using the new copy created by the child using exec?
Comments (6)
The exec call replaces the child's entire address space with a whole new address space. Any shared address space would be replaced completely by the call.
The vfork function exists only as an optimization. For some operating systems, fork is very expensive because the child process could potentially modify any page mapped into memory, so every single page must be marked copy-on-write (or, originally, actually copied!) so as not to modify the parent's corresponding pages. A very common sequence is fork followed immediately by exec, forcing these systems to remap all the pages just to throw them all away a split second later. Rather than going to the trouble of modifying all the mappings, vfork allows you to leave the mappings in an undefined state in the child process under the assumption that you're not going to use them anyway.
As a result, doing certain things after a vfork can create a mess. But as soon as you call exec, all the undefined mappings are gone anyway.
In practice, operating systems handle vfork one of two ways: For operating systems where changing all mappings to copy-on-write is inexpensive or that haven't implemented a vfork optimization, vfork is identical to fork. For operating systems that do use a vfork optimization, vfork leaves the parent and child fully sharing most pages, causing bad things to happen if the child modifies them (the modifications show up in the parent).
So the short answer to your question is that if vfork were designed that way, it wouldn't be usable for its sole intended purpose.
I think your key misunderstanding is what exec does: it does not "overwrite memory" with the new process. Rather, it throws away the calling process's entire virtual memory (whether it was previously private mappings, shared mappings, or whatever) and creates a completely new virtual address space for the calling process ID, corresponding to the new process image (executable). This has no bearing on the parent's address space, except that the reference count on the memory management structures is decremented (it was incremented by vfork).
vfork may not actually share the address space. It is specifically undefined whether or not it does so. This is because duplicating the address space has become very cheap on modern operating systems, so having to implement a call that doesn't duplicate it may be more trouble than it's worth.
Also, if vfork does share the address space, it will be sharing the stack. Having one process pop items off a shared stack unbeknownst to the other is a very bad idea.
exec creates a brand new address space for the process and 'forgets' the old one. Since in a vfork situation there may (or may not) be two processes using that address space, a reference count on it will be decremented and the parent process will be able to continue using the address space just fine.
A child process cannot 'return' from a successful exec. After a successful exec, a new address space is created and execution begins in the process starting at main.
vfork does potentially have the effect of pausing the parent until the child executes exec or exit. In this sense a child can sort of return from exec, because the execution of the parent process will then resume if it has been halted. But the address space of the parent process is left untouched even in the shared situation, because either the exec or the exit case will simply result in one less reference to the original (the parent's) address space.
I think this is the basic point of confusion: Normally, fork creates a new address space by duplicating the parent, and exec replaces the caller's address space with a fresh one loaded from an executable on disk. So, if vfork doesn't duplicate the parent address space, how is it that calling exec after vfork doesn't destroy the parent address space, leaving the parent with nowhere to resume execution?
The answer is that that would make vfork useless, so the kernel avoids it. When exec is called from the child side of a vfork, it creates a new address space, loads the executable there, and leaves the calling address space alone. The child process is then context-switched to the new address space, and the parent process resumes execution in its unmodified original address space.
All of the danger of vfork stems from the child temporarily executing in the parent's address space until it calls exec or _exit. Any side effects of what the child does in there stick, and affect the parent, possibly catastrophically. Unless you're on a system where vfork is just an alias for fork, in which case they don't stick. Thus you can't count on either behavior and you have to avoid doing anything in the child.
vfork was invented as an optimization for fork + exec. The whole idea was, 'if your plan is to call fork() and then exec(...), use vfork and we'll do whatever we can to take advantage of that and speed things up.'
The restriction is to allow implementors the maximum flexibility, including arbitrary surprises if you do anything other than exec.
A child can't 'call exec and then return'. The exec family does not return on success. It replaces the entire image. So the second part of your question isn't answerable.
vfork MIGHT not actually run the forked process in a separate address-space, so it behaves more like a "thread" (except without concurrent execution or a separate stack). This means that you have to do, well, nothing in the child except exec or _exit.
Some kernels that support vfork (uClinux? ELKS?) do not support fork: on MMU-less systems, for example, supporting fork() is essentially impossible (even by copying pages). Each process needs to be started independently, as they all share one address space.
So vfork can be correctly implemented on these, but fork can't.