Killing the children together with the parent

Posted 2024-08-14 18:17:16

I have a program that spawns and communicates with CPU-heavy, unstable processes that I did not write. If my app crashes or is killed by SIGKILL, I want the subprocesses to be killed as well, so the user doesn't have to track them down and kill them manually.

I know this topic has been covered before, but I have tried all the methods described, and none of them seems to pass the test.

I know it must be possible, since terminals do it all the time. If I run something in a terminal, and kill the terminal, the stuff always dies.

I have tried atexit, double fork and ptys. atexit doesn't work for sigkill; double fork doesn't work at all; and ptys I have found no way to work with using python.

Today, I found out about prctl(PR_SET_PDEATHSIG, SIGKILL), which should be a way for child processes to order a kill on themselves when their parent dies. I tried to use it with popen, but it seems to have no effect at all:

import ctypes, subprocess
libc = ctypes.CDLL('/lib/libc.so.6')
PR_SET_PDEATHSIG = 1; TERM = 15
implant_bomb = lambda: libc.prctl(PR_SET_PDEATHSIG, TERM)
subprocess.Popen(['gnuchess'], preexec_fn=implant_bomb)

In the above, the child is created and the parent exits. Now you would expect gnuchess to receive the signal (SIGTERM in this snippet) and die, but it doesn't. I can still find it in my process manager using 100% CPU.

Can anybody tell me if there is something wrong with my use of prctl, or do you know how terminals manage to kill their children?

Comments (8)

Hello爱情风 2024-08-21 18:17:16

I know it's been years, but I found a simple (slightly hacky) solution to this problem. On Linux, it is solved by having your parent process launch every command through a very simple C wrapper program that calls prctl() and then exec(). I call it "yeshup":

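#include <signal.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <unistd.h>

int main(int argc, char **argv) {
     if (argc < 2)
          return 1;
     /* Ask the kernel to send this process SIGHUP when its parent dies. */
     prctl(PR_SET_PDEATHSIG, SIGHUP, 0, 0, 0);
     /* The setting survives exec, so the real command keeps it. */
     execvp(argv[1], &argv[1]);
     perror("execvp");   /* reached only if exec failed */
     return 127;
}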

When spawning your child processes from Python (or any other language), you can run "yeshup gnuchess [arguments]". You'll find that, when the parent process is killed, all your child processes should be sent SIGHUP.

This works because Linux honors the prctl setting (it is not cleared) even after execvp is called, which effectively "transforms" the yeshup process into a gnuchess process, or whatever command you specify there. A fork(), by contrast, clears the setting in the new child.

咋地 2024-08-21 18:17:16

prctl's PR_SET_PDEATHSIG can only be set for the very process that is calling prctl, not for any other process, including that process's children. The way the man page I'm pointing to expresses this is "This value is cleared upon a fork()". fork, of course, is the way other processes are spawned (in Linux and any other Unix-y OS).

If you have no control over the code you want to run in subprocesses (as is essentially the case for your gnuchess example), I suggest you first spawn a separate small "monitor" process whose role is to keep track of all of its siblings (your parent process can tell the monitor each sibling's pid as it spawns it) and to send them killer signals when the common parent dies. The monitor needs to poll for that, waking up every N seconds, for some N of your choice, to check whether the parent is still alive; within a loop, use select with a timeout of N seconds to wait for more info from the parent.

Not trivial, but then such system tasks often aren't. Terminals do it differently (via the concept of a "controlling terminal" for a process group) but of course it's trivial for any child to block THAT off (double forks, nohup, and so on).
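The monitor idea could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a tested production design: the names run_monitor and start_monitor are made up here, PIDs are sent to the monitor as text lines over a pipe, and end-of-file on that pipe is treated as an additional sign that the parent is gone, on top of the periodic getppid() poll. PID reuse is not handled.

```python
import os
import select
import signal


def run_monitor(read_fd, parent_pid, poll_seconds=1.0):
    """Runs inside the monitor: collect sibling PIDs, kill them once the parent is gone."""
    pids = set()
    buffer = b""
    while True:
        ready, _, _ = select.select([read_fd], [], [], poll_seconds)
        if ready:
            chunk = os.read(read_fd, 4096)
            if not chunk:
                break  # EOF: every write end is closed, so the parent has died
            buffer += chunk
            *lines, buffer = buffer.split(b"\n")
            pids.update(int(line) for line in lines if line)
        elif os.getppid() != parent_pid:
            break  # timeout: we were re-parented, so the parent has died
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # that sibling had already exited


def start_monitor(poll_seconds=1.0):
    """Fork the monitor; return a function the parent calls with each new child PID."""
    parent_pid = os.getpid()
    read_fd, write_fd = os.pipe()
    if os.fork() == 0:
        os.close(write_fd)  # otherwise the monitor would never see EOF
        run_monitor(read_fd, parent_pid, poll_seconds)
        os._exit(0)
    os.close(read_fd)

    def register(pid):
        os.write(write_fd, b"%d\n" % pid)

    return register
```

The parent would call register = start_monitor() once at startup and then register(child.pid) after each Popen. Because the kernel closes the pipe when the parent dies for any reason, including SIGKILL, the monitor notices without the parent having to cooperate.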

你又不是我 2024-08-21 18:17:16

Actually, I found that your original approach worked just fine for me. Here's the example code I tested with, which worked:

echoer.py

#!/usr/bin/env python3

import sys
import time

i = 0
try:
    while True:
        i += 1
        print(i, flush=True)
        time.sleep(1)
except KeyboardInterrupt:
    # SIGINT, the parent-death signal chosen by parentProc.py, arrives here.
    print("\nechoer caught KeyboardInterrupt")
    sys.exit(0)

parentProc.py

#!/usr/bin/env python3

import ctypes
import subprocess
import time

# Path as on the answerer's system; adjust it if your libc lives elsewhere.
libc = ctypes.CDLL('/lib64/libc.so.6')
PR_SET_PDEATHSIG = 1  # from <linux/prctl.h>
SIGINT = 2
SIGTERM = 15

def set_death_signal(signal):
    # Runs in the child between fork and exec (via preexec_fn).
    libc.prctl(PR_SET_PDEATHSIG, signal)

def set_death_signal_int():
    set_death_signal(SIGINT)

def set_death_signal_term():
    set_death_signal(SIGTERM)

#subprocess.Popen(['./echoer.py'], preexec_fn=set_death_signal_term)
subprocess.Popen(['./echoer.py'], preexec_fn=set_death_signal_int)
time.sleep(1.5)
print("parentProc exiting...")
仙女山的月亮 2024-08-21 18:17:16

I thought the double fork was to detach from a controlling terminal. I'm not sure how you are trying to use it.

It's a hack, but you could always call 'ps' and search for the name of the process you're trying to kill.

风为裳 2024-08-21 18:17:16

I've seen very nasty ways of "cleaning up" using things like ps xuawww | grep myApp | awk '{ print $2 }' | xargs -n1 kill -9

The client process, if started with popen, can catch SIGPIPE and die. There are many ways to go about this, but it really depends on a lot of factors. If you put some ping code (ping to parent) in the child, you can ensure that a SIGPIPE is raised once the parent is dead. If the child catches it, which it should, it'll terminate. You'd need bidirectional communication for this to work correctly, or to always block against the client as the originator of communication. If you don't want to modify the child, ignore this.

Assuming that you don't expect the actual Python interpreter to segfault, you could add each PID to a sequence, and then kill them all on exit. This should be safe for normal exits and even uncaught exceptions. Python has facilities to run code at exit (such as atexit) for clean-up.
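A minimal sketch of that "remember every child, kill on exit" idea, assuming the children are started through subprocess.Popen. The helper names spawn and kill_children are made up for illustration. As the question already notes, atexit handlers do not run when the interpreter itself is killed by SIGKILL, so this only covers normal exits and uncaught exceptions.

```python
import atexit
import subprocess

# Every child started through spawn() is remembered here.
_children = []


def spawn(args):
    """Start a child process and remember it for clean-up at exit."""
    child = subprocess.Popen(args)
    _children.append(child)
    return child


def kill_children():
    """Kill every remembered child that is still running, then reap it."""
    for child in _children:
        if child.poll() is None:  # still running
            child.kill()          # sends SIGKILL on POSIX
            child.wait()          # reap it so no zombie is left behind


# Runs on normal interpreter exit and after an uncaught exception.
atexit.register(kill_children)
```

Usage would be spawn(['gnuchess']) in place of a direct subprocess.Popen(['gnuchess']) call.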

Here's some safer nasty: Append each child PID to a file, including your master process (separate file). Use file locking. Build a watchdog daemon that looks at the flock() state of your master pid. If it's not locked, kill every PID in your child PID list. Run the same code on startup.
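The lock-based watchdog could look roughly like this. It is a sketch under assumptions: the function names and the one-PID-per-line file format are invented here, and the lock file and PID file paths are whatever the master and the watchdog agree on. The master keeps the descriptor returned by hold_master_lock open for its whole lifetime; the kernel releases the lock when that descriptor is closed, including when the master dies. Children that inherit the descriptor across a fork would keep the lock alive, so the master should not leak it.

```python
import fcntl
import os
import signal


def hold_master_lock(lock_path):
    """Called by the master: take an exclusive lock and return the open descriptor."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fd


def master_is_alive(lock_path):
    """Called by the watchdog: True while some process still holds the lock."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True  # the lock is taken, so the master is still running
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


def kill_listed_children(pid_file):
    """Send SIGKILL to every PID listed one per line in pid_file."""
    with open(pid_file) as handle:
        pids = [int(line) for line in handle if line.strip()]
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already gone
```

The watchdog daemon would then loop, sleeping between calls to master_is_alive, and call kill_listed_children once it returns False. A PID that has been recycled by an unrelated process would be killed too, which is part of why this approach is "nasty".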

More nasty: Write the PIDs to files, as above, then invoke your app in a sub-shell: (./myMaster; ./killMyChildren)

倚栏听风 2024-08-21 18:17:16

I'm wondering whether the PR_SET_PDEATHSIG flag is getting cleared even though you set it after the fork (and before the exec); from the docs, it seems it shouldn't be.

In order to test that theory, you could try the following: use the same code to run a subprocess that's written in C and basically just calls prctl(PR_GET_PDEATHSIG, &result) and prints the result.

Another thing you might try: adding explicit zeros for arg3, arg4, and arg5 when you call prctl. I.e.:

>>> implant_bomb = lambda: libc.prctl(PR_SET_PDEATHSIG, TERM, 0, 0, 0)
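The check suggested above can also be done without writing a C program. The following sketch, which assumes a glibc-based Linux system where the library can be loaded as "libc.so.6", sets the signal in preexec_fn with explicit zero arguments and lets a freshly exec'ed Python child read the value back with PR_GET_PDEATHSIG. The function name pdeathsig_seen_by_child is made up for illustration.

```python
import ctypes
import signal
import subprocess
import sys

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>

# Assumption: glibc is loadable under this name; other libcs use other names.
libc = ctypes.CDLL("libc.so.6", use_errno=True)

# Code run by the exec'ed child: read back its own parent-death signal.
CHILD_CODE = """
import ctypes
libc = ctypes.CDLL("libc.so.6", use_errno=True)
result = ctypes.c_int(-1)
libc.prctl(2, ctypes.byref(result), 0, 0, 0)  # 2 == PR_GET_PDEATHSIG
print(result.value)
"""


def pdeathsig_seen_by_child(signum):
    """Set the parent-death signal after fork, then report what the child sees after exec."""

    def set_pdeathsig():
        # Runs in the child between fork and exec.
        zero = ctypes.c_ulong(0)
        rc = libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(int(signum)), zero, zero, zero)
        if rc != 0:
            raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")

    done = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=set_pdeathsig,
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return int(done.stdout)


if __name__ == "__main__":
    print(pdeathsig_seen_by_child(signal.SIGTERM))
```

If the printed value equals the signal number that was set, the flag survived the exec, and the cause of the original problem has to be looked for elsewhere.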

There are some security restrictions to take into account, because if we call setuid after execv the child cannot receive the signal. The complete list of these restrictions is here

good luck !
/Mohamed

很糊涂小朋友 2024-08-21 18:17:16

Other answers mention prctl's PR_SET_PDEATHSIG but leave out the fact that this can be set from the command line using the setpriv command:

setpriv --pdeathsig HUP [command] &