当我在 Unix 中调用 fork() 时会发生什么?

发布于 2024-12-05 05:52:04 字数 250 浏览 0 评论 0原文

我试图查找这一点,但在调用 fork() 后,我很难理解父进程和子进程之间的关系。

它们是完全独立的进程,仅通过 id/parent id 关联吗?或者他们共享记忆吗?例如,每个进程的“代码”部分 - 是否是重复的,以便每个进程都有自己相同的副本,或者是否以某种方式“共享”,以便只有一个存在?

我希望这是有道理的。

以充分披露的名义,这是“与作业相关的”;虽然这不是书中的直接问题,但我感觉这主要是学术性的,在实践中,我可能不需要知道。

I've tried to look this up, but I'm struggling a bit to understand the relation between the Parent Process and the Child Process immediately after I call fork().

Are they completely separate processes, only associated by the id/parent id? Or do they share memory? For example the 'code' section of each process - is that duplicated so that each process has it's own identical copy, or is that 'shared' in some way so that only one exists?

I hope that makes sense.

In the name of full disclosure this is 'homework related'; while not a direct question from the book, I have a feeling it's mostly academic and, in practice, I probably don't need to know.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

○闲身 2024-12-12 05:52:04

似乎在过程中,整个内存都是重复的。

实际上,它使用“写入”系统。首次在fork()之后首次更改其内存时,将单独的副本(通常为4KB)进行。

通常,一个过程的代码段不会被修改,在这种情况下,它仍然共享。

As it appears to the process, the entire memory is duplicated.

In reality, it uses "copy on write" system. The first time either process changes its memory after fork(), a separate copy is made of the modified page (usually 4kB).

Usually the code segment of a process is not modified, in which case it remains shared.

余罪 2024-12-12 05:52:04

从逻辑上讲,分叉会创建与原始过程相同的副本,并且在很大程度上独立于原始过程。出于性能原因,内存通过写时复制语义进行共享,这意味着未修改的内存(例如代码)仍然是共享的。

文件描述符是重复的,因此分叉进程原则上可以代表父进程接管数据库连接(或者,如果程序员有点扭曲,它们甚至可以联合与数据库通信)。更常见的是,这用于在进程之间设置管道,因此您可以编写 find -name '*.c' | xargs grep fork

还有很多其他的东西是共享的。有关详细信息,请参阅此处

一个重要的遗漏是线程——子进程仅继承调用fork()的线程。这在多线程程序中会导致无穷无尽的麻烦,因为在父级中锁定的互斥体等的状态是特定于实现的(并且不要忘记 malloc()printf() 在内部使用锁)。在 fork() 返回后,子进程中唯一安全的做法是尽快调用 execve(),即便如此,您也必须对文件描述符保持谨慎。请参阅此处了解完整的恐怖故事。

Logically, a fork creates an identical copy of the original process that is largely independent of the original. For performance reasons, memory is shared with copy-on-write semantics, which means that unmodified memory (such as code) remains shared.

File descriptors are duplicated, so that the forked process could, in principle, take over a database connection on behalf of the parent (or they could even jointly communicate with the database if the programmer is a bit twisted). More commonly, this is used to set up pipes between processes so you can write find -name '*.c' | xargs grep fork.

A bunch of other stuff is shared. See here for details.

One important omission is threads — the child process only inherits the thread that called fork(). This causes no end of trouble in multithreaded programs, since the status of mutexes, etc., that were locked in the parent is implementation-specific (and don't forget that malloc() and printf() use locks internally). The only safe thing to do in the child after fork() returns is to call execve() as soon as possible, and even then you have to be cautious with file descriptors. See here for the full horror story.

子栖 2024-12-12 05:52:04
  1. 它们是单独的进程,即子进程和父进程将具有单独的 PID。
  2. 子进程将从父进程继承所有打开的描述符。
  3. 内部页面(即可以修改的堆栈/堆区域)与 .text 区域不同,将共享 b/ w 父级和子级,直到其中之一尝试修改内容。在这种情况下,会创建一个新页面,并将特定于正在修改的页面的数据复制到这个新分配的页面,并映射到与导致更改的页面相对应的区域(可以是父页面或子页面)。这称为 COW(本论坛的其他成员在上面的回答中提到过)。
  4. 子进程可以完成执行,直到父进程使用 wait() 或 waitpid() 调用回收为止,子进程将处于 ZOMBIE 状态。这将有助于从进程表中清除子进程条目并允许子进程 pid 被重用。通常,当子进程死亡时,SIGCHLD 信号会发送到父进程,理想情况下,这会导致调用处理程序,然后在该处理程序中执行 wait() 调用。
  5. 如果父进程退出时没有清理已经运行的或僵尸子进程(通过 wait() waitpid 调用),则 init() 进程 (PID 1) 将成为这些现在的孤儿进程的父进程。此 init() 进程定期执行 wait() 或 waitpid() 调用。

编辑:错别字
华泰

  1. They are separate processes i.e. the Child and the Parent will have separate PIDs
  2. The child will inherit all of the open descriptors from the Parent
  3. Internally the pages i.e. the stack/heap regions which can be modified unlike the .text region, will be shared b/w the Parent and the Child until one of them tries to modify the contents. In such cases a new page is created and data specific to the page being modified is copied to this freshly allocated page and mapped to the region corresponding to the one who caused the change - could be either the Parent or Child. This is called COW (mentioned by other members in this forum above in their answers).
  4. The Child can finish execution and until reclaimed by the parent using the wait() or waitpid() calls will be in ZOMBIE state. This will help clear the child's process entry from the process table and allow the child pid to be reused. Usually when a child dies, the SIGCHLD signal is sent out to the parent which would ideally result in a handler being called subsequent to which the wait() call is executed in that handler.
  5. In case the Parent exits without cleaning up the already running or zombie child (via the wait() waitpid calls), the init() process (PID 1) becomes the parent to these now orphan children. This init() process executes wait() or waitpid() calls at regular intervals.

EDIT: typos
HTH

黎歌 2024-12-12 05:52:04

是的,它们是独立的进程,但具有一些特殊的“属性”。其中之一就是子女与父母的关系。

但更重要的是以写时复制(COW)方式共享内存页面:直到其中一个在页面上执行写入(对全局变量或其他内容)为止,内存页面都是共享的。当执行写入时,内核会创建该页面的副本并将其映射到正确的地址。

COW 魔法是在内核中通过将页面标记为只读并使用故障机制来完成的。

Yes, they are separate processes, but with some special "properties". One of them is the child-parent relation.

But more important is the sharing of memory pages in a copy-on-write (COW) manner: until the one of them performs a write (to a global variable or whatever) on a page, the memory pages are shared. When a write is performed, a copy of that page is created by the kernel and mapped at the right address.

The COW magic is done by in the kernel by marking the pages as read-only and using the fault mechanism.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文