Python multiprocessing pool inside a daemon process
I opened a question for this problem earlier and did not get a thorough enough answer to solve the issue (most likely due to a lack of rigor in explaining my issue, which is what I am attempting to correct here): Zombie process in python multiprocessing daemon
I am trying to implement a Python daemon that uses a pool of workers to execute commands using Popen. I have borrowed the basic daemon from http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
I have only changed the __init__, daemonize (or equally the start) and stop methods. Here are the changes to the __init__ method:
def __init__(self, pidfile):
    #, stdin='/dev/null', stdout='STDOUT', stderr='STDOUT'):
    #self.stdin = stdin
    #self.stdout = stdout
    #self.stderr = stderr
    self.pidfile = pidfile
    self.pool = Pool(processes=4)
I am not setting stdin, stdout and stderr so that I can debug the code with print statements. Also, I have tried moving this pool around to a few places but this is the only place that does not produce exceptions.
Here are the changes to the daemonize method:
def daemonize(self):
    ...
    # redirect standard file descriptors
    #sys.stdout.flush()
    #sys.stderr.flush()
    #si = open(self.stdin, 'r')
    #so = open(self.stdout, 'a+')
    #se = open(self.stderr, 'a+', 0)
    #os.dup2(si.fileno(), sys.stdin.fileno())
    #os.dup2(so.fileno(), sys.stdout.fileno())
    #os.dup2(se.fileno(), sys.stderr.fileno())
    print self.pool
    ...
Same thing: I am not redirecting I/O so that I can debug. The print here is used so that I can check the pool's location.
And the stop method changes:
def stop(self):
    ...
    # Try killing the daemon process
    try:
        print self.pool
        print "closing pool"
        self.pool.close()
        print "joining pool"
        self.pool.join()
        print "set pool to None"
        self.pool = None
        while 1:
            print "kill process"
            os.kill(pid, SIGTERM)
    ...
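For reference, the usual shutdown sequence for a Pool is a single close() (or terminate()) followed by a single join(), outside of any kill-retry loop. A minimal Python 3 sketch of that ordering (the work and run_batch names are illustrative, not from the original code):

```python
from multiprocessing import Pool

def work(x):
    # Trivial task; must be module-level so it can be pickled to workers.
    return x + 1

def run_batch(items):
    pool = Pool(processes=4)
    try:
        return pool.map(work, items)
    finally:
        pool.close()   # stop accepting new tasks; workers exit once idle
        pool.join()    # reap the worker processes so none are left defunct
```

Calling close() and join() repeatedly inside a loop, as in the stop method above, is exactly the pattern this ordering avoids.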
Here the idea is that I not only need to kill the process but also clean up the pool. The self.pool = None is just a random attempt to solve the issue, which didn't work. At first I thought this was a problem with zombie children, which occurred when I had the self.pool.close() and self.pool.join() inside the while loop with the os.kill(pid, SIGTERM). That was before I decided to start looking at the pool's location via print self.pool. After doing this, I believe the pool is not the same when the daemon starts and when it stops. Here is some output:
me@pc:~/pyCode/jobQueue$ sudo ./jobQueue.py start
<multiprocessing.pool.Pool object at 0x1c543d0>
me@pc:~/pyCode/jobQueue$ sudo ./jobQueue.py stop
<multiprocessing.pool.Pool object at 0x1fb7450>
closing pool
joining pool
set pool to None
kill process
kill process
... [ stuck in infinite loop]
The different locations of the objects suggest to me that they are not the same pool and that one of them is probably the zombie?
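They are indeed distinct objects, even before any daemonization: each command-line invocation (start, stop) is a separate process that constructs a fresh daemon object, so a Pool created in __init__ is built anew every time. A stripped-down sketch (the MiniDaemon class is a hypothetical stand-in for the daemon class) showing that two constructions yield unrelated pools:

```python
from multiprocessing import Pool

class MiniDaemon(object):
    # Hypothetical stand-in: creating the Pool in __init__ means every
    # invocation of the script gets its own pool of worker processes.
    def __init__(self, pidfile):
        self.pidfile = pidfile
        self.pool = Pool(processes=2)

def demo():
    start_side = MiniDaemon('/tmp/demo.pid')   # like "jobQueue.py start"
    stop_side = MiniDaemon('/tmp/demo.pid')    # like "jobQueue.py stop"
    same = start_side.pool is stop_side.pool   # unrelated objects
    for d in (start_side, stop_side):          # clean up both pools
        d.pool.terminate()
        d.pool.join()
    return same
```

So the pool that stop closes and joins is not the daemon's pool at all; the daemon's workers are never reached from the stop invocation.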
After CTRL+C, here is what I get from ps aux|grep jobQueue:
root 21161 0.0 0.0 50384 5220 ? Ss 22:59 0:00 /usr/bin/python ./jobQueue.py start
root 21162 0.0 0.0 0 0 ? Z 22:59 0:00 [jobQueue.py] <defunct>
me 21320 0.0 0.0 7624 940 pts/0 S+ 23:00 0:00 grep --color=auto jobQueue
I have tried moving the self.pool = Pool(processes=4) to a number of different places. If it is moved to the start() or daemonize() methods, print self.pool will throw an exception saying that it is NoneType. In addition, the location seems to change the number of zombie processes that will pop up.
Currently, I have not added the functionality to run anything via the workers. My problem seems completely related to setting up the pool of workers correctly. I would appreciate any information that leads to solving this issue or advice about creating a daemon service that uses a pool of workers to execute a series of commands using Popen
. Since I haven't gotten that far, I do not know what challenges I face ahead. I am thinking I might just need to write my own pool but if there is a nice trick to make the pool work here, it would be amazing.
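For the eventual goal of workers executing commands with Popen, a hypothetical sketch of what the worker side might look like (the run_command and run_commands names are assumptions, not part of the original code; this assumes a Unix system where echo-style commands exist):

```python
import subprocess
from multiprocessing import Pool

def run_command(cmd):
    # Each worker launches one command via Popen and returns its result.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out

def run_commands(commands):
    # Fan a list of argv-style commands out across the pool.
    pool = Pool(processes=4)
    try:
        return pool.map(run_command, commands)
    finally:
        pool.close()
        pool.join()
```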
The solution is to put the self.pool = Pool(processes=4) as the last line of the daemonize method. Otherwise the pool ends up getting lost somewhere (perhaps in the forks). Then the pool can be accessed inside the run method, which is overloaded by the application you wish to daemonize. However, the pool cannot be accessed in the stop method; to do so would lead to NoneType exceptions. I believe there is a more elegant solution, but this works and it is all I have for now. If I want the stop to fail when the pool is still in action, I will have to add additional functionality to run and some form of message, but I am not currently concerned with this.
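The solution described above can be sketched as follows (Python 3, with the double fork elided for brevity; the method names mirror the jejik.com skeleton, and square is an illustrative placeholder for real work):

```python
from multiprocessing import Pool

def square(x):
    # Placeholder task; module-level so it pickles to the workers.
    return x * x

class Daemon(object):
    def __init__(self, pidfile):
        self.pidfile = pidfile
        self.pool = None            # NOT created here any more

    def daemonize(self):
        # ... double fork / setsid / pidfile handling elided in this sketch ...
        # Creating the Pool last, after the forks, means the workers are
        # children of the daemonized process rather than of a parent that
        # has already exited (the "lost in the forks" problem).
        self.pool = Pool(processes=4)

    def run(self):
        # Overloaded by the application being daemonized; the pool created
        # in daemonize() is valid in this process.
        return self.pool.map(square, [1, 2, 3])
```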