init never reaps zombie/defunct processes
On my Fedora Core 9 web server running kernel 2.6.18, init isn't reaping zombie processes. This would be bearable were it not for the process table eventually reaching its upper limit, at which point no new processes can be created.
Sample output of ps -el | grep 'Z':
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
5 Z 0 2648 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
1 Z 51 2656 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
1 Z 0 2670 1 0 75 0 - 0 exit ? 00:00:02 crond <defunct>
4 Z 0 2874 1 0 82 0 - 0 exit ? 00:00:00 mysqld_safe <defunct>
5 Z 0 28104 1 0 76 0 - 0 exit ? 00:00:00 httpd <defunct>
5 Z 0 28716 1 0 76 0 - 0 exit ? 00:00:06 lfd <defunct>
5 Z 74 10172 1 0 75 0 - 0 exit ? 00:00:00 sshd <defunct>
5 Z 0 11199 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11202 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11205 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11208 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11211 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11240 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11246 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11249 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
5 Z 0 11252 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
1 Z 0 14106 1 0 80 0 - 0 exit ? 00:00:00 anacron <defunct>
5 Z 0 14631 1 0 75 0 - 0 exit ? 00:00:00 sendmail <defunct>
Is this an OS bug? A misconfiguration? I'm looking for inspiration as to the source of this problem.
Thanks
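For readers unfamiliar with the terminology, here is a minimal Python sketch of what "zombie" and "reaping" mean. It assumes a Linux system with /proc mounted; the helper names read_state and demo_zombie are made up for this illustration. The child exits at once, stays in state Z until its parent calls waitpid(), and only then disappears from the process table.

```python
import os
import time


def read_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before reading the state field.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child, observe it as a zombie, then reap it.

    Returns (state_before_reaping, proc_entry_exists_after_reaping).
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running any cleanup handlers.
        os._exit(0)

    # Parent: poll until the child has exited, but do not reap it yet.
    deadline = time.time() + 5
    state = read_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = read_state(pid)

    # Reaping: waitpid() collects the exit status and frees the table slot.
    os.waitpid(pid, 0)
    return state, os.path.exists(f"/proc/{pid}")


if __name__ == "__main__":
    before, still_there = demo_zombie()
    print("state before waitpid():", before)
    print("/proc entry still present after waitpid():", still_there)
```

In the listing above every zombie has PPID 1, so the process expected to make that waitpid() call is init itself.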
Comments (1)
This has hit me on Ubuntu in two ways:
Something was wrong with the kernel. In my case a kernel driver had crashed and the process internals went bonkers. The best way to test for this is to check /var/log/syslog (and dmesg) for anything that looks awry, for example "BUG: unable to handle kernel NULL pointer dereference at 0000000000000028".
The other time I've seen this is when init is not the "parent of the child process for most purposes" (actual manpage quote). This can happen when you use the ptrace syscall (which the strace program uses internally) to attach to a process. For instance, I got into a situation where I had attached strace to child process B. Eventually process B terminated, as did its parent (I'm not sure in which order). Process B then looked like a zombie owned by init. However, its "most purposes" parent was actually the strace program. After I killed strace, process B was reaped.
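To check for the second case, one option is to look at the TracerPid field in /proc/<pid>/status for each zombie. The sketch below is a rough Python version of that idea, not a tested diagnostic: the function names are invented for illustration, and I am assuming that a zombie still held by a tracer such as strace reports a non-zero TracerPid (I have not verified this on 2.6.18 specifically).

```python
import os


def parse_status(text):
    """Parse the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_zombie(fields):
    """Return (name, ppid, tracer_pid) for a zombie, or None otherwise."""
    if not fields.get("State", "").startswith("Z"):
        return None
    return (
        fields.get("Name", "?"),
        int(fields.get("PPid", 0)),
        int(fields.get("TracerPid", 0)),
    )


def find_zombies(proc_root="/proc"):
    """Yield (pid, name, ppid, tracer_pid) for every zombie under proc_root."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        path = os.path.join(proc_root, entry, "status")
        try:
            with open(path, errors="replace") as f:
                info = describe_zombie(parse_status(f.read()))
        except OSError:
            continue  # process vanished between listdir() and open()
        if info is not None:
            yield (int(entry),) + info


if __name__ == "__main__":
    for pid, name, ppid, tracer in find_zombies():
        note = f"traced by PID {tracer}" if tracer else "not traced"
        print(f"{pid}\t{name}\tPPid={ppid}\t{note}")
```

If a zombie shows PPid=1 together with a non-zero TracerPid, the tracer is a likely candidate for what is holding it. If every TracerPid is 0, the kernel log is the next place I would look.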