How can I avoid zombies in Perl CGI scripts run under Apache 1.3?


Various Perl scripts (Server Side Includes) are calling a Perl module with many functions on a website.
EDIT:
The scripts are using use lib to reference the libraries from a folder.
During busy periods the scripts (not the libraries) become zombies and overload the server.

The server's process list shows:

319 ?        Z      0:00 [scriptname1.pl] <defunct>    
320 ?        Z      0:00 [scriptname2.pl] <defunct>    
321 ?        Z      0:00 [scriptname3.pl] <defunct>

I have hundreds of instances of each.

EDIT:
We are not using fork, system, or exec, apart from the SSI directive

<!--#exec cgi="/cgi-bin/scriptname.pl"-->

As far as I know, in this case httpd itself will be the owner of the process.
MaxRequestsPerChild is set to 0, which should keep the parent from dying before its child processes have finished.

So far we have figured out that temporarily suspending some of the scripts helps the server cope with the defunct processes and keeps it from falling over; zombie processes are, however, still forming without a doubt.
Apparently gbacon's theory comes closest to the truth: the server is simply unable to cope with the load.

What could lead to httpd abandoning these processes?
Is there any best practice to prevent these from happening?

Thanks

Answer:
The point goes to Rob.
As he says, CGI scripts that generate SSIs will not have those SSIs handled: SSI evaluation happens before CGI execution in the Apache 1.3 request cycle. This was fixed in Apache 2.0 and later so that CGIs can generate SSI commands.
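To make the failure mode concrete, here is a minimal sketch of the pattern (the script and include paths are hypothetical, not from our actual site). Under Apache 1.3 the directive in the output reaches the browser as literal text, because SSI evaluation has already happened earlier in the request cycle:

#!/usr/bin/perl
# hypothetical /cgi-bin/page.pl -- a CGI whose output contains an SSI directive
use strict;
use warnings;

print "Content-type: text/html\n\n";
print "<html><body>\n";
# Under Apache 1.3 the next line is sent to the client verbatim instead of
# being evaluated; under 2.0+ (with the right filters) it can be processed.
print '<!--#include virtual="/footer.html" -->', "\n";
print "</body></html>\n";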

Since we were running Apache 1.3, every page view turned the SSIs into defunct processes. Although the server was trying to clear them, it was far too busy with its running tasks to succeed. As a result, the server fell over and became unresponsive.
As a short-term solution, we reviewed all the SSIs and moved some of the processing to the client side to free up server resources and give the server time to clean up.
Later we upgraded to Apache 2.2.


Comments (3)

心凉怎暖 2024-08-24 14:55:16


More Band-Aid than best practice, but sometimes you can get away with a simple

$SIG{CHLD} = "IGNORE";

According to the perlipc documentation

On most Unix platforms, the CHLD (sometimes also known as CLD) signal has special behavior with respect to a value of 'IGNORE'. Setting $SIG{CHLD} to 'IGNORE' on such a platform has the effect of not creating zombie processes when the parent process fails to wait() on its child processes (i.e., child processes are automatically reaped). Calling wait() with $SIG{CHLD} set to 'IGNORE' usually returns -1 on such platforms.

If you care about the exit statuses of child processes, you need to collect them (commonly referred to as "reaping") by calling wait or waitpid. Despite the creepy name, a zombie is merely a child process that has exited but whose status has not yet been reaped.
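If you do need the statuses, the usual idiom from perlipc is a non-blocking SIGCHLD reaper; here is a minimal sketch (the handler name and the logging are arbitrary choices, not part of any required API):

use strict;
use warnings;
use POSIX ":sys_wait_h";    # for WNOHANG

# Reap every child that has exited, without blocking,
# and record each exit status from $?.
sub reaper {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        my $status = $? >> 8;
        warn "child $pid exited with status $status\n";
    }
    $SIG{CHLD} = \&reaper;  # reinstall for old SysV-style signal semantics
}
$SIG{CHLD} = \&reaper;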

If your Perl programs themselves are the child processes becoming zombies, that means their parents (the ones that are forking-and-forgetting your code) need to clean up after themselves. A process cannot stop itself from becoming a zombie.

∞觅青森が 2024-08-24 14:55:16


I just saw your comment that you are running Apache 1.3 and that may be associated with your problem.

SSIs can run CGIs, but CGI scripts that generate SSIs will not have those SSIs handled: SSI evaluation happens before CGI execution in the Apache 1.3 request cycle. This was fixed in Apache 2.0 and later so that CGIs can generate SSI commands.

As I'd suggested above, try running your scripts on their own and have a look at the output. Are they generating SSIs?

Edit: Have you tried launching a trivial Perl CGI script that simply prints out a Hello World-type HTTP response?

Then, if this works, add a trivial SSI directive such as

<!--#printenv -->

and see what happens.
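For reference, a minimal smoke-test CGI along those lines might look like this (the path and filename are just examples):

#!/usr/bin/perl
# hypothetical /cgi-bin/hello.pl -- smoke test for the basic CGI path
use strict;
use warnings;

print "Content-type: text/html\n\n";
print "<html><body><p>Hello World</p></body></html>\n";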

Edit 2: Just realised what is probably happening. Zombies occur when a child process exits and isn't reaped; these processes hang around and slowly use up entries in the process table. (A process without a parent, by contrast, is an orphaned process.)

Are you forking off processes within your Perl script? If so, have you added a waitpid() call to the parent?

Have you also got the correct exit within the script?

CORE::exit(0);
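Putting the two together, a minimal fork-and-reap sketch (purely illustrative; do_work is a hypothetical stand-in for whatever the child actually does):

#!/usr/bin/perl
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # child: do the work, then exit explicitly
    do_work();              # hypothetical worker routine
    CORE::exit(0);
}

# parent: block until the child is reaped, so no zombie is left behind
waitpid($pid, 0);
print "child $pid exited with status ", $? >> 8, "\n";

sub do_work { sleep 1 }     # stand-in for real work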
零時差 2024-08-24 14:55:16


As you have all the bits yourself, I'd suggest running the individual scripts one at a time from the command line to see if you can spot the ones that are hanging.

Does a ps listing show an inordinate number of instances of one particular script running?

Are you running the CGIs under mod_perl?

Edit: Just saw your comments regarding SSIs. Don't forget that SSI directives can themselves run Perl scripts. Have a look at what the CGIs are trying to run.

Are they dependent on yet another server or service?
