当 Perl 中发生警报时，我应该如何清理挂起的孙进程？

发布于 2024-09-01 07:28:00 字数 2363 浏览 7 评论 0原文

我有一个并行自动化脚本，需要调用许多其他脚本，其中一些脚本挂起，因为它们（错误地）等待标准输入或等待各种其他不会发生的事情。这不是什么大问题，因为我用 alarm 捕获了那些。诀窍是在子进程关闭时关闭那些挂起的孙进程。我认为 SIGCHLD、等待和进程组的各种咒语都可以达到目的，但它们都会阻塞，并且孙子不会被收获。

我的解决方案虽然有效，但似乎不是正确的解决方案。我目前对 Windows 解决方案还不是特别感兴趣，但我最终也会需要它。我的只适用于 Unix，目前还好。

我编写了一个小脚本，它需要运行同时并行子进程的数量和分叉的总数：

 $ fork_bomb <parallel jobs> <number of forks>

 $ fork_bomb 8 500

这可能会在几分钟内达到每个用户进程的限制。我发现的许多解决方案只是告诉您增加每个用户的进程限制，但我需要它运行大约 300,000 次，所以这是行不通的。同样，重新执行等清除进程表的建议也不是我所需要的。我想真正解决问题，而不是用胶带包裹它。

我爬行进程表查找子进程，并在 SIGALRM 处理程序中单独关闭挂起的进程，该处理程序需要终止，因为此后其余的实际代码没有成功的希望。从性能角度来看，在进程表中进行的混乱爬行并不会打扰我，但我不介意不这样做：

use Parallel::ForkManager;
use Proc::ProcessTable;

my $pm = Parallel::ForkManager->new( $ARGV[0] );

my $alarm_sub = sub {
        kill 9,
            map  { $_->{pid} }
            grep { $_->{ppid} == $$ }
            @{ Proc::ProcessTable->new->table }; 

        die "Alarm rang for $$!\n";
        };

foreach ( 0 .. $ARGV[1] ) 
    {
    print ".";
    print "\n" unless $count++ % 50;

    my $pid = $pm->start and next; 

    local $SIG{ALRM} = $alarm_sub;

    eval {
        alarm( 2 );
        system "$^X -le '<STDIN>'"; # this will hang
        alarm( 0 );
        };

    $pm->finish;
    }

如果您想用完进程，请取出杀死。

我认为设置一个进程组会起作用，这样我就可以一起杀死所有东西，但这会阻止：

my $alarm_sub = sub {
        kill 9, -$$;    # blocks here
        die "Alarm rang for $$!\n";
        };

foreach ( 0 .. $ARGV[1] ) 
    {
    print ".";
    print "\n" unless $count++ % 50;

    my $pid = $pm->start and next; 
    setpgrp(0, 0);

    local $SIG{ALRM} = $alarm_sub;

    eval {
        alarm( 2 );
        system "$^X -le '<STDIN>'"; # this will hang
        alarm( 0 );
        };

    $pm->finish;
    }

与相同的事情POSIX 的 setsid 也不起作用，我认为这实际上以不同的方式破坏了事情，因为我并没有真正将其守护化。

奇怪的是， Parallel::ForkManager 的 run_on_finish 也会发生相同的清理代码迟到了：此时孙子进程显然已经与子进程解除关联。

原文

I have a parallelized automation script which needs to call many other scripts, some of which hang because they (incorrectly) wait for standard input or wait around for various other things that aren't going to happen. That's not a big deal because I catch those with alarm. The trick is to shut down those hung grandchild processes when the child shuts down. I thought various incantations of SIGCHLD, waiting, and process groups could do the trick, but they all block and the grandchildren aren't reaped.

My solution, which works, just doesn't seem like it is the right solution. I'm not especially interested in the Windows solution just yet, but I'll eventually need that too. Mine only works for Unix, which is fine for now.

I wrote a small script that takes the number of simultaneous parallel children to run and the total number of forks:

 $ fork_bomb <parallel jobs> <number of forks>

 $ fork_bomb 8 500

This will probably hit the per-user process limit within a couple of minutes. Many solutions I've found just tell you to increase the per-user process limit, but I need this to run about 300,000 times, so that isn't going to work. Similarly, suggestions to re-exec and so on to clear the process table aren't what I need. I'd like to actually fix the problem instead of slapping duct tape over it.

I crawl the process table looking for the child processes and shut down the hung processes individually in the SIGALRM handler, which needs to die because the rest of real code has no hope of success after that. The kludgey crawl through the process table doesn't bother me from a performance perspective, but I wouldn't mind not doing it:

use Parallel::ForkManager;
use Proc::ProcessTable;

my $pm = Parallel::ForkManager->new( $ARGV[0] );

my $alarm_sub = sub {
        kill 9,
            map  { $_->{pid} }
            grep { $_->{ppid} == $ }
            @{ Proc::ProcessTable->new->table }; 

        die "Alarm rang for $!\n";
        };

foreach ( 0 .. $ARGV[1] ) 
    {
    print ".";
    print "\n" unless $count++ % 50;

    my $pid = $pm->start and next; 

    local $SIG{ALRM} = $alarm_sub;

    eval {
        alarm( 2 );
        system "$^X -le '<STDIN>'"; # this will hang
        alarm( 0 );
        };

    $pm->finish;
    }

If you want to run out of processes, take out the kill.

I thought that setting a process group would work so I could kill everything together, but that blocks:

my $alarm_sub = sub {
        kill 9, -$;    # blocks here
        die "Alarm rang for $!\n";
        };

foreach ( 0 .. $ARGV[1] ) 
    {
    print ".";
    print "\n" unless $count++ % 50;

    my $pid = $pm->start and next; 
    setpgrp(0, 0);

    local $SIG{ALRM} = $alarm_sub;

    eval {
        alarm( 2 );
        system "$^X -le '<STDIN>'"; # this will hang
        alarm( 0 );
        };

    $pm->finish;
    }

The same thing with POSIX's setsid didn't work either, and I think that actually broke things in a different way since I'm not really daemonizing this.

Curiously, Parallel::ForkManager's run_on_finish happens too late for the same clean-up code: the grandchildren are apparently already disassociated from the child processes at that point.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

时光病人 2024-09-08 07:28:00

我已经读过这个问题好几次了，我想我已经明白你的意思了
正在努力做。你有一个控制脚本。这个脚本产生
孩子们去做一些事情，这些孩子会产生孙子
真正做这项工作。问题是孙子可以
太慢了（等待 STDIN，或者其他什么），你想杀死它们。
此外，如果有一个缓慢的孙子，您想要整个
孩子死（如果可能的话，杀死其他孙子）。

所以，我尝试通过两种方式实现这一点。第一个是让
父进程在一个新的 UNIX 会话中生成一个子进程，为几个进程设置一个计时器
秒，并在计时器关闭时终止整个子会话。
这使得父母既要对孩子负责，也要对孩子负责。
孙子们。它也没有正常工作。

下一个策略是让父进程生成子进程，然后
让孩子负责管理孙子。它会
为每个孙子设置一个计时器，如果进程没有设置则终止它
过期时间退出。这很好用，所以这是代码。

我们将使用 EV 来管理子项和计时器，并使用 AnyEvent 来管理
API。（您可以尝试另一个 AnyEvent 事件循环，例如 Event 或 POE。
但我知道 EV 正确处理了孩子退出的情况
在你告诉循环监视它之前，这消除了恼人的竞争
其他循环容易受到这种情况的影响。）

#!/usr/bin/env perl

use strict;
use warnings;
use feature ':5.10';

use AnyEvent;
use EV; # you need EV for the best child-handling abilities

我们需要跟踪子观察者：

# active child watchers
my %children;

然后我们需要编写一个函数来启动子观察者。那些事
父母生成的东西称为孩子，而孩子的东西
产生的称为工作。

sub start_child($@) {
    my ($on_success, $on_error, @jobs) = @_;

参数是子进程完成时调用的回调
成功（意味着它的工作也成功），当
孩子没有成功完成，然后是一个coderef列表
要运行的作业。

在这个函数中，我们需要fork。在父级中，我们设置一个子级
监视子进程的观察者：

    if(my $pid = fork){ # parent
        # monitor the child process, inform our callback of error or success
        say "$: Starting child process $pid";
        $children{$pid} = AnyEvent->child( pid => $pid, cb => sub {
            my ($pid, $status) = @_;
            delete $children{$pid};

            say "$: Child $pid exited with status $status";
            if($status == 0){
                $on_success->($pid);
            }
            else {
                $on_error->($pid);
            }
        });
    }

在子进程中，我们实际上运行作业。这涉及到一点点
不过，设置。

首先，我们忘记了父母的孩子观察者，因为它不会使
让孩子知道其兄弟姐妹退出的情况。（叉子是
有趣，因为你继承了父级的所有状态，即使
完全没有意义。）

    else { # child
        # kill the inherited child watchers
        %children = ();
        my %timers;

我们还需要知道所有工作何时完成，以及是否完成
他们都很成功。我们使用计数条件变量来
确定一切都已退出的时间。我们在启动时递增，并且
退出时递减，当计数为 0 时，我们知道一切都已完成。

我还保留一个布尔值来指示错误状态。如果一个进程
以非零状态退出，错误变为 1。否则，它保持 0。
您可能想要保留比这更多的状态:)

        # then start the kids
        my $done = AnyEvent->condvar;
        my $error = 0;

        $done->begin;

（我们也从 1 开始计数，这样如果有 0 个作业，我们的流程
仍然退出。）

现在我们需要为每个作业分叉并运行该作业。在父辈中，我们
做几件事。我们增加条件变量。我们设置一个计时器来杀死
孩子如果太慢了。我们设置了一个儿童观察者，这样我们就可以
了解作业的退出状态。

    for my $job (@jobs) {
            if(my $pid = fork){
                say "[c] $: starting job $job in $pid";
                $done->begin;

                # this is the timer that will kill the slow children
                $timers{$pid} = AnyEvent->timer( after => 3, interval => 0, cb => sub {
                    delete $timers{$pid};

                    say "[c] $: Killing $pid: too slow";
                    kill 9, $pid;
                });

                # this monitors the children and cancels the timer if
                # it exits soon enough
                $children{$pid} = AnyEvent->child( pid => $pid, cb => sub {
                    my ($pid, $status) = @_;
                    delete $timers{$pid};
                    delete $children{$pid};

                    say "[c] [j] $: job $pid exited with status $status";
                    $error ||= ($status != 0);
                    $done->end;
                });
            }

使用定时器比闹钟更容易一些，因为它带有
与它声明。每个计时器都知道要杀死哪个进程，这很容易
当进程成功退出时取消计时器——我们只是
从哈希中删除它。

那是（孩子的）父母。孩子（孩子的；或
job）非常简单：

            else {
                # run kid
                $job->();
                exit 0; # just in case
            }

如果您愿意，您也可以在此处关闭标准输入。

现在，在所有进程都生成之后，我们等待它们
全部通过等待 condvar 退出。事件循环将监视
孩子和计时器，并为我们做正确的事情：

        } # this is the end of the for @jobs loop
        $done->end;

        # block until all children have exited
        $done->recv;

然后，当所有孩子都退出时，我们可以做任何清理
我们想要的工作，例如：

        if($error){
            say "[c] $: One of your children died.";
            exit 1;
        }
        else {
            say "[c] $: All jobs completed successfully.";
            exit 0;
        }
    } # end of "else { # child"
} # end of start_child

好的，这就是子孙/工作。现在我们只需要写
父母，这要容易得多。

像孩子一样，我们将使用计数 condvar 来等待我们的
孩子们。

# main program
my $all_done = AnyEvent->condvar;

我们需要做一些工作。这是一个总是成功的，并且
如果你按下回车键就会成功，但如果你按下回车键就会失败
只需让它被计时器杀死：

my $good_grandchild = sub {
    exit 0;
};

my $bad_grandchild = sub {
    my $line = <STDIN>;
    exit 0;
};

那么我们只需要启动子作业即可。如果你还记得方法
回到start_child顶部，需要两次回调，一个错误
回调，以及成功回调。我们将进行设置；错误
回调将打印“not ok”并递减 condvar，并且
成功回调将打印“ok”并执行相同的操作。很简单。

my $ok  = sub { $all_done->end; say "$: $_[0] ok" };
my $nok = sub { $all_done->end; say "$: $_[0] not ok" };

然后我们就可以开始生一群有更多孙子的孩子
jobs：

say "starting...";

$all_done->begin for 1..4;
start_child $ok, $nok, ($good_grandchild, $good_grandchild, $good_grandchild);
start_child $ok, $nok, ($good_grandchild, $good_grandchild, $bad_grandchild);
start_child $ok, $nok, ($bad_grandchild, $bad_grandchild, $bad_grandchild);
start_child $ok, $nok, ($good_grandchild, $good_grandchild, $good_grandchild, $good_grandchild);

其中两个会超时，两个会成功。如果你按回车键
不过，当他们在跑步时，他们可能都会成功。

不管怎样，一旦这些开始了，我们只需要等待他们
完成：

$all_done->recv;

say "...done";

exit 0;

这就是程序。

我们没有做 Parallel::ForkManager 所做的一件事是
“速率限制”我们的分叉，以便只有 n 个子进程以某个速度运行
时间。不过，这很容易手动实现：

 use Coro;
 use AnyEvent::Subprocess; # better abstraction than manually
                           # forking and making watchers
 use Coro::Semaphore;

 my $job = AnyEvent::Subprocess->new(
    on_completion => sub {}, # replace later
    code          => sub { the child process };
 )

 my $rate_limit = Coro::Semaphore->new(3); # 3 procs at a time

 my @coros = map { async {
     my $guard = $rate_limit->guard;
     $job->clone( on_completion => Coro::rouse_cb )->run($_);
     Coro::rouse_wait;
 }} ({ args => 'for first job' }, { args => 'for second job' }, ... );

 # this waits for all jobs to complete
 my @results = map { $_->join } @coros;

这样做的好处是，您可以在孩子的同时做其他事情
正在运行 - 在执行之前只需使用 async 生成更多线程
阻止加入。您对孩子也有更多的控制权
使用 AnyEvent::Subprocess - 您可以在 Pty 中运行子进程并提供
它是标准输入（与 Expect 一样），您可以捕获它的标准输入和标准输出
和 stderr，或者你可以忽略这些东西，或者其他什么。你可以到
决定，而不是某个试图让事情变得“简单”的模块作者。

无论如何，希望这会有所帮助。

I've read the question a few times, and I think I sort of get what you
are trying to do. You have a control script. This script spawns
children to do some stuff, and these children spawn the grandchildren
to actually do the work. The problem is that the grandchildren can be
too slow (waiting for STDIN, or whatever), and you want to kill them.
Furthermore, if there is one slow grandchild, you want the entire
child to die (killing the other grandchildren, if possible).

So, I tried implementing this two ways. The first was to make the
parent spawn a child in a new UNIX session, set a timer for a few
seconds, and kill the entire child session when the timer went off.
This made the parent responsible for both the child and the
grandchildren. It also didn't work right.

The next strategy was to make the parent spawn the child, and then
make the child responsible for managing the grandchildren. It would
set a timer for each grandchild, and kill it if the process hadn't
exited by expiration time. This works great, so here is the code.

We'll use EV to manage the children and timers, and AnyEvent for the
API. (You can try another AnyEvent event loop, like Event or POE.
But I know that EV correctly handles the condition where a child exits
before you tell the loop to monitor it, which eliminates annoying race
conditions that other loops are vulnerable to.)

#!/usr/bin/env perl

use strict;
use warnings;
use feature ':5.10';

use AnyEvent;
use EV; # you need EV for the best child-handling abilities

We need to keep track of the child watchers:

# active child watchers
my %children;

Then we need to write a function to start the children. The things
the parent spawns are called children, and the things the children
spawn are called jobs.

sub start_child($@) {
    my ($on_success, $on_error, @jobs) = @_;

The arguments are a callback to be called when the child completes
successfully (meaning its jobs were also a success), a callback when
the child did not complete successfully, and then a list of coderef
jobs to run.

In this function, we need to fork. In the parent, we setup a child
watcher to monitor the child:

    if(my $pid = fork){ # parent
        # monitor the child process, inform our callback of error or success
        say "$: Starting child process $pid";
        $children{$pid} = AnyEvent->child( pid => $pid, cb => sub {
            my ($pid, $status) = @_;
            delete $children{$pid};

            say "$: Child $pid exited with status $status";
            if($status == 0){
                $on_success->($pid);
            }
            else {
                $on_error->($pid);
            }
        });
    }

In the child, we actually run the jobs. This involves a little bit of
setup, though.

First, we forget the parent's child watchers, because it doesn't make
sense for the child to be informed of its siblings exiting. (Fork is
fun, because you inherit all of the parent's state, even when that
makes no sense at all.)

    else { # child
        # kill the inherited child watchers
        %children = ();
        my %timers;

We also need to know when all the jobs are done, and whether or not
they were all a success. We use a counting conditional variable to
determine when everything has exited. We increment on startup, and
decrement on exit, and when the count is 0, we know everything's done.

I also keep a boolean around to indicate error state. If a process
exits with a non-zero status, error goes to 1. Otherwise, it stays 0.
You might want to keep more state than this :)

        # then start the kids
        my $done = AnyEvent->condvar;
        my $error = 0;

        $done->begin;

(We also start the count at 1 so that if there are 0 jobs, our process
still exits.)

Now we need to fork for each job, and run the job. In the parent, we
do a few things. We increment the condvar. We set a timer to kill
the child if it's too slow. And we setup a child watcher, so we can
be informed of the job's exit status.

    for my $job (@jobs) {
            if(my $pid = fork){
                say "[c] $: starting job $job in $pid";
                $done->begin;

                # this is the timer that will kill the slow children
                $timers{$pid} = AnyEvent->timer( after => 3, interval => 0, cb => sub {
                    delete $timers{$pid};

                    say "[c] $: Killing $pid: too slow";
                    kill 9, $pid;
                });

                # this monitors the children and cancels the timer if
                # it exits soon enough
                $children{$pid} = AnyEvent->child( pid => $pid, cb => sub {
                    my ($pid, $status) = @_;
                    delete $timers{$pid};
                    delete $children{$pid};

                    say "[c] [j] $: job $pid exited with status $status";
                    $error ||= ($status != 0);
                    $done->end;
                });
            }

Using the timer is a little bit easier than alarm, since it carries
state with it. Each timer knows which process to kill, and it's easy
to cancel the timer when the process exits successfully -- we just
delete it from the hash.

That's the parent (of the child). The child (of the child; or the
job) is really simple:

            else {
                # run kid
                $job->();
                exit 0; # just in case
            }

You could also close stdin here, if you wanted to.

Now, after all the processes have been spawned, we wait for them to
all exit by waiting on the condvar. The event loop will monior the
children and timers, and do the right thing for us:

        } # this is the end of the for @jobs loop
        $done->end;

        # block until all children have exited
        $done->recv;

Then, when all the children have exited, we can do whatever cleanup
work we want, like:

        if($error){
            say "[c] $: One of your children died.";
            exit 1;
        }
        else {
            say "[c] $: All jobs completed successfully.";
            exit 0;
        }
    } # end of "else { # child"
} # end of start_child

OK, so that's the child and grandchild/job. Now we just need to write
the parent, which is a lot easier.

Like the child, we are going to use a counting condvar to wait for our
children.

# main program
my $all_done = AnyEvent->condvar;

We need some jobs to do. Here's one that is always successful, and
one that will be successful if you press return, but will fail if you
just let it be killed by the timer:

my $good_grandchild = sub {
    exit 0;
};

my $bad_grandchild = sub {
    my $line = <STDIN>;
    exit 0;
};

So then we just need to start the child jobs. If you remember way
back to the top of start_child, it takes two callbacks, an error
callback, and a success callback. We'll set those up; the error
callback will print "not ok" and decrement the condvar, and the
success callback will print "ok" and do the same. Very simple.

my $ok  = sub { $all_done->end; say "$: $_[0] ok" };
my $nok = sub { $all_done->end; say "$: $_[0] not ok" };

Then we can start a bunch of children with even more grandchildren
jobs:

say "starting...";

$all_done->begin for 1..4;
start_child $ok, $nok, ($good_grandchild, $good_grandchild, $good_grandchild);
start_child $ok, $nok, ($good_grandchild, $good_grandchild, $bad_grandchild);
start_child $ok, $nok, ($bad_grandchild, $bad_grandchild, $bad_grandchild);
start_child $ok, $nok, ($good_grandchild, $good_grandchild, $good_grandchild, $good_grandchild);

Two of those will timeout, and two will succeed. If you press enter
while they're running, though, then they might all succeed.

Anyway, once those have started, we just need to wait for them to
finish:

$all_done->recv;

say "...done";

exit 0;

And that's the program.

One thing that we aren't doing that Parallel::ForkManager does is
"rate limiting" our forks so that only n children are running at a
time. This is pretty easy to manually implement, though:

 use Coro;
 use AnyEvent::Subprocess; # better abstraction than manually
                           # forking and making watchers
 use Coro::Semaphore;

 my $job = AnyEvent::Subprocess->new(
    on_completion => sub {}, # replace later
    code          => sub { the child process };
 )

 my $rate_limit = Coro::Semaphore->new(3); # 3 procs at a time

 my @coros = map { async {
     my $guard = $rate_limit->guard;
     $job->clone( on_completion => Coro::rouse_cb )->run($_);
     Coro::rouse_wait;
 }} ({ args => 'for first job' }, { args => 'for second job' }, ... );

 # this waits for all jobs to complete
 my @results = map { $_->join } @coros;

The advantage here is that you can do other things while your children
are running -- just spawn more threads with async before you do the
blocking join. You also have a lot more control over the children
with AnyEvent::Subprocess -- you can run the child in a Pty and feed
it stdin (like with Expect), and you can capture its stdin and stdout
and stderr, or you can ignore those things, or whatever. You get to
decide, not some module author that's trying to make things "simple".

Anyway, hope this helps.

回复收藏 0 原文

晨与橙与城 2024-09-08 07:28:00

Brian - 这有点粗鲁和不惯用，但我见过的一种方法是这样的：每当你 fork 时，你：

给子进程一个程序的第一个“-id”虚拟参数，有点像唯一（每个 PID）值 - 一个好的候选值可能是毫秒时间戳 + 父进程的 PID。
父进程将子进程 PID 和 -id 值以及所需的超时/终止时间记录到（理想情况下，持久的）注册表中。

然后让一个观察者进程（无论是最终的祖父母进程还是具有相同 UID 的单独进程）只需定期循环访问注册表，并检查哪些需要被杀死的进程（根据终止时间）仍然在附近（通过匹配注册表中的 PID 和“-id”参数值以及进程表中的 PID 和命令行）；并将信号 9 发送到此类进程（或者友善地尝试先尝试发送信号 2 来轻轻杀死）。

独特的“-id”参数显然是为了防止杀死一些无辜的进程，这些进程恰好重用了先前进程的 PID，这可能是考虑到您提到的规模。

注册表的想法有助于解决“已经脱离关联”的孙子的问题，因为您不再依赖系统来为您保留父/子关联。

这是一种蛮力，但既然还没人回答，我想我会用你的方式思考我的 3 美分的想法。

回复收藏 0 原文

年华零落成诗 2024-09-08 07:28:00

我必须在模块我一直在努力。我对我的所有解决方案也不完全满意，但在 Unix 上通常有效的是

更改子进程组
根据需要
生成孙子进程再次更改子进程组（例如，回到其原始值）
发出信号孙子的进程组杀死孙子

类似：

use Time::HiRes qw(sleep);

sub be_sleepy { sleep 2 ** (5 * rand()) }
$SIGINT = 2;

for (0 .. $ARGV[1]) {
    print ".";
    print "\n" unless ++$count % 50;
    if (fork() == 0) {   
        # a child process
        # $ORIGINAL_PGRP and $NEW_PGRP should be global or package or object level vars
        $ORIGINAL_PGRP = getpgrp(0);
        setpgrp(0, $);
        $NEW_PGRP = getpgrp(0);

        local $SIG{ALRM} = sub {
            kill_grandchildren();
            die "$ timed out\n";
        };

        eval {
            alarm 2;
            while (rand() < 0.5) {
                if (fork() == 0) {
                    be_sleepy();
                }
            }
            be_sleepy();
            alarm 0;
            kill_grandchildren();
        };

        exit 0;
    }
}

sub kill_grandchildren {
    setpgrp(0, $ORIGINAL_PGRP);
    kill -$SIGINT, $NEW_PGRP;   # or  kill $SIGINT, -$NEW_PGRP
}

这并不是完全万无一失的。孙子可能会更改他们的进程组或陷阱信号。

当然，这些都不适用于 Windows，但我们只能说 TASKKILL /F /T 是您的朋友。

更新：此解决方案不处理（无论如何对我来说）子进程调用system“perl -le ''”时的情况。对我来说，这会立即挂起进程，并阻止 SIGALRM 触发和 SIGALRM 处理程序运行。关闭 STDIN 是唯一的解决方法吗？

I have to solve this same problem in a module I've been working on. I'm not completely satisfied with all of my solution(s) either, but what generally works on Unix is to

change a child's process group
spawn grandchildren as necessary
change the child's process group again (say, back to its original value)
signal the grandchildren's process group to kill the grandchildren

Something like:

use Time::HiRes qw(sleep);

sub be_sleepy { sleep 2 ** (5 * rand()) }
$SIGINT = 2;

for (0 .. $ARGV[1]) {
    print ".";
    print "\n" unless ++$count % 50;
    if (fork() == 0) {   
        # a child process
        # $ORIGINAL_PGRP and $NEW_PGRP should be global or package or object level vars
        $ORIGINAL_PGRP = getpgrp(0);
        setpgrp(0, $);
        $NEW_PGRP = getpgrp(0);

        local $SIG{ALRM} = sub {
            kill_grandchildren();
            die "$ timed out\n";
        };

        eval {
            alarm 2;
            while (rand() < 0.5) {
                if (fork() == 0) {
                    be_sleepy();
                }
            }
            be_sleepy();
            alarm 0;
            kill_grandchildren();
        };

        exit 0;
    }
}

sub kill_grandchildren {
    setpgrp(0, $ORIGINAL_PGRP);
    kill -$SIGINT, $NEW_PGRP;   # or  kill $SIGINT, -$NEW_PGRP
}

This isn't completely fool proof. The grandchildren might change their process groups or trap signals.

None of this will work on Windows, of course, but let's just say that TASKKILL /F /T is your friend.

Update: This solution doesn't handle (for me, anyway) the case when the child process invokes system "perl -le '<STDIN>'". For me, this immediately suspends the process, and prevents the SIGALRM from firing and the SIGALRM handler from running. Is closing STDIN the only workaround?

回复收藏 0 原文

~没有更多了~