init never reaping zombie/defunct processes

Posted on 2024-08-14 07:22:09

On my Fedora Core 9 webserver with kernel 2.6.18, init isn't reaping zombie processes. This would be bearable if it weren't for the process table eventually reaching its upper limit, at which point no new processes can be created.

Sample output of ps -el | grep 'Z':

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
5 Z     0  2648     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
1 Z    51  2656     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
1 Z     0  2670     1  0  75   0 -     0 exit   ?        00:00:02 crond <defunct>
4 Z     0  2874     1  0  82   0 -     0 exit   ?        00:00:00 mysqld_safe <defunct>
5 Z     0 28104     1  0  76   0 -     0 exit   ?        00:00:00 httpd <defunct>
5 Z     0 28716     1  0  76   0 -     0 exit   ?        00:00:06 lfd <defunct>
5 Z    74 10172     1  0  75   0 -     0 exit   ?        00:00:00 sshd <defunct>
5 Z     0 11199     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11202     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11205     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11208     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11211     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11240     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11246     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11249     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11252     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
1 Z     0 14106     1  0  80   0 -     0 exit   ?        00:00:00 anacron <defunct>
5 Z     0 14631     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>

Is this an OS bug? A misconfiguration? I'm looking for inspiration as to the source of this problem.
Thanks

Comments (1)

寄意 2024-08-21 07:22:09

This has hit me on Ubuntu in two ways:

  1. Something is wrong with the kernel. In my case a kernel driver had crashed and the process internals went bonkers. The best way to test this is to check /var/log/syslog (and dmesg) for anything that looks awry, for example "BUG: unable to handle kernel NULL pointer dereference at 0000000000000028".

  2. The other time I've seen this is when init is not the "parent of the child process for most purposes" (actual manpage quote). This can happen when you use the ptrace syscall (which the strace program uses internally) to attach to a process. For instance, I've gotten into a situation where I attached strace to child process B. Eventually, process B terminated, as did its parent (I'm not sure in what order). Process B then looked like a zombie owned by init. However, its "most purposes" parent was actually the strace program. After I killed strace, process B was reaped.
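
To check for the second case, it may help to look at the TracerPid field that Linux exposes in /proc/<pid>/status next to PPid. Below is a minimal Python 3 sketch, not something taken from the original setup: the helper names read_status and list_zombies are made up for illustration, and it assumes a Linux /proc with the State, PPid and TracerPid fields described in proc(5). It lists every zombie together with its parent and, if there is one, the process tracing it.

```python
import os


def read_status(pid):
    """Parse /proc/<pid>/status into a dict; return None if it cannot be read.

    Hypothetical helper for illustration. Assumes the Linux proc(5) layout
    of "Key:<tab>value" lines.
    """
    try:
        with open(f"/proc/{pid}/status") as fh:
            fields = {}
            for line in fh:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
            return fields
    except OSError:
        # The process may have vanished between listing /proc and opening it.
        return None


def list_zombies():
    """Return (pid, name, ppid, tracer_pid) tuples for every zombie process."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        status = read_status(int(entry))
        if status is None:
            continue
        # The State line looks like "Z (zombie)" for defunct processes.
        if not status.get("State", "").startswith("Z"):
            continue
        zombies.append((
            int(entry),
            status.get("Name", "?"),
            int(status.get("PPid", 0)),
            int(status.get("TracerPid", 0)),  # 0 means "not being traced"
        ))
    return zombies


if __name__ == "__main__":
    for pid, name, ppid, tracer in list_zombies():
        note = f"traced by PID {tracer}" if tracer else "not traced"
        print(f"{pid:>7} {name:<16} PPid={ppid:<7} {note}")
```

A zombie with PPid=1 and a non-zero TracerPid would be consistent with the strace situation described above, in which case ending the tracer should let it be reaped. A zombie with PPid=1 and no tracer points back at init itself or at the kernel, as in the first case.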
