Problem exiting a daemon process

Posted 2024-07-25 20:58:23 · 2056 characters · 4 views · 0 comments

I am writing a daemon program that spawns several child processes. After I run the stop script, the main process keeps running when it is supposed to quit, which really confuses me.

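import os
import signal
import time
from multiprocessing import Process, cpu_count, JoinableQueue

import daemon
from http import httpserv  # presumably the asker's own module, not the standard-library http package
from worker import work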

class Manager:
    """
    This manager starts the http server processes and worker
    processes, creates the input/output queues that keep the processes
    work together nicely.
    """
    def __init__(self):
        self.NUMBER_OF_PROCESSES = cpu_count()

    def start(self):
        self.i_queue = JoinableQueue()
        self.o_queue = JoinableQueue()

        # Create worker processes
        self.workers = [Process(target=work,
                                args=(self.i_queue, self.o_queue))
                        for i in range(self.NUMBER_OF_PROCESSES)]
        for w in self.workers:
            w.daemon = True
            w.start()

        # Create the http server process
        self.http = Process(target=httpserv, args=(self.i_queue, self.o_queue))
        self.http.daemon = True
        self.http.start()

        # Keep the current process from returning
        self.running = True
        while self.running:
            time.sleep(1)

    def stop(self):
        print("quitting ...")

        # Stop accepting new requests from users
        os.kill(self.http.pid, signal.SIGINT)

        # Waiting for all requests in output queue to be delivered
        self.o_queue.join()

        # Put sentinel None to input queue to signal worker processes
        # to terminate
        self.i_queue.put(None)
        for w in self.workers:
            w.join()
        self.i_queue.join()

        # Let main process return
        self.running = False


manager = Manager()
context = daemon.DaemonContext()
context.signal_map = {
        signal.SIGHUP: lambda signum, frame: manager.stop(),
        }

context.open()
manager.start()

The stop script is just a one-liner, os.kill(pid, signal.SIGHUP). After it runs, the child processes (the worker processes and the http server process) end nicely, but the main process just stays there, and I don't know what keeps it from returning.

Comments (2)

川水往事 2024-08-01 20:58:23

You create the http server process but don't join() it. What happens if, rather than doing an os.kill() to stop the http server process, you send it a stop-processing sentinel (None, like you send to the workers) and then do a self.http.join()?

Update: You also need to send the None sentinel to the input queue once for each worker. You could try:

    for w in self.workers:
        self.i_queue.put(None)
    for w in self.workers:
        w.join()

N.B. The reason you need two loops is that if you put the None into the queue in the same loop that does the join(), that None may be picked up by a worker other than w, so joining on w will cause the caller to block.

You don't show the code for workers or http server, so I assume these are well-behaved in terms of calling task_done etc. and that each worker will quit as soon as it sees a None, without get()-ing any more things from the input queue.
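The assumptions above can be made concrete with a minimal sketch. The `work` function and the `run_and_stop` helper here are hypothetical stand-ins, since the asker's worker code is not shown, and the explicit "fork" start method assumes a POSIX system. The worker calls task_done() for every item it takes, including the None sentinel, and the caller sends one sentinel per worker before joining any of them.

```python
import multiprocessing as mp

# Assumption: POSIX system, so the "fork" start method is available.
ctx = mp.get_context("fork")


def work(i_queue, o_queue):
    """Hypothetical worker: doubles each item and exits on the None sentinel."""
    while True:
        item = i_queue.get()
        if item is None:
            # Acknowledge the sentinel too, so i_queue.join() can return.
            i_queue.task_done()
            break
        o_queue.put(item * 2)
        i_queue.task_done()


def run_and_stop(num_workers=3, items=(1, 2, 3, 4)):
    """Start workers, process the items, then shut every worker down."""
    i_queue = ctx.JoinableQueue()
    o_queue = ctx.Queue()
    workers = [ctx.Process(target=work, args=(i_queue, o_queue))
               for _ in range(num_workers)]
    for w in workers:
        w.daemon = True
        w.start()

    for item in items:
        i_queue.put(item)
    # Drain the results first so no worker is left holding unsent data.
    results = sorted(o_queue.get(timeout=10) for _ in items)

    # First loop: one sentinel per worker.
    for _ in workers:
        i_queue.put(None)
    # Second loop: only now wait for the workers to exit.
    for w in workers:
        w.join()
    i_queue.join()
    return results, [w.exitcode for w in workers]


if __name__ == "__main__":
    print(run_and_stop())
```

With a single sentinel, only one worker would ever leave its loop, and joining any of the others would block indefinitely.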

Also, note that there is at least one open, hard-to-reproduce issue with JoinableQueue.task_done(), which may be biting you.

一曲爱恨情仇 2024-08-01 20:58:23

I tried a different approach, and this seems to work (note I took out the daemon portions of the code as I didn't have that module installed).

import signal
from multiprocessing import cpu_count

class Manager:
    """
    This manager starts the http server processes and worker
    processes, creates the input/output queues that keep the processes
    work together nicely.
    """
    def __init__(self):
        self.NUMBER_OF_PROCESSES = cpu_count()

    def start(self):

        # all your code minus the loop

        print("waiting to die")

        signal.pause()

    def stop(self):
        print("quitting ...")

        # all your code minus self.running


manager = Manager()

signal.signal(signal.SIGHUP, lambda signum, frame: manager.stop())

manager.start()

One caveat: signal.pause() wakes up on any signal, not just SIGHUP, so you may want to change your code accordingly.
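One way to account for that, sketched here under assumptions rather than taken from the original code: keep calling signal.pause() in a loop until the SIGHUP handler has actually set a flag. The `Stopper` class and `wait_for_stop` function are hypothetical names used only for illustration.

```python
import signal


class Stopper:
    """Hypothetical holder for the 'stop was requested' flag."""

    def __init__(self):
        self.stop_requested = False

    def handle_hup(self, signum, frame):
        # Only record the request; the waiting loop reacts to it.
        self.stop_requested = True


def wait_for_stop(stopper):
    """Sleep in signal.pause() until the stop flag is set.

    Returns how many times pause() woke up, which can be more than
    once if unrelated signals arrive first.
    """
    wakeups = 0
    while not stopper.stop_requested:
        signal.pause()  # returns after any handled signal
        wakeups += 1
    return wakeups
```

Typical use would be `signal.signal(signal.SIGHUP, stopper.handle_hup)` followed by `wait_for_stop(stopper)`. Note that a small race remains: a SIGHUP arriving between the flag check and the pause() call is only noticed on the next wakeup.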

EDIT:

The following works just fine for me:

import daemon
import signal
import time

class Manager:
    """
    This manager starts the http server processes and worker
    processes, creates the input/output queues that keep the processes
    work together nicely.
    """
    def __init__(self):
        self.NUMBER_OF_PROCESSES = 5

    def start(self):

        # all your code minus the loop

        print("waiting to die")
        self.running = 1
        while self.running:
            time.sleep(1)

        print("quit")

    def stop(self):
        print("quitting ...")

        # all your code minus self.running

        self.running = 0


manager = Manager()

context = daemon.DaemonContext()
context.signal_map = {signal.SIGHUP : lambda signum, frame: manager.stop()}

context.open()
manager.start()

Which version of Python are you using?
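A possible variation on the loop above, offered as a sketch and not as something either answer proposes: let the signal handler do nothing except clear the flag, and run the blocking cleanup from the main loop once it exits. The `request_stop` and `shutdown` names are hypothetical, and `shutdown` is only a placeholder for the real queue and process joins.

```python
import signal
import time


class Manager:
    """Hypothetical manager whose signal handler only clears a flag."""

    def __init__(self):
        self.running = False
        self.cleaned_up = False

    def request_stop(self, signum, frame):
        # No join() calls here: the handler returns immediately.
        self.running = False

    def shutdown(self):
        # Placeholder for the real cleanup (sentinels, joins, ...).
        self.cleaned_up = True

    def start(self, poll_interval=0.05):
        self.running = True
        while self.running:
            time.sleep(poll_interval)
        # Cleanup runs in normal program flow, outside the handler.
        self.shutdown()
```

The handler would be registered the same way as before, for example `context.signal_map = {signal.SIGHUP: manager.request_stop}`. The design choice is that any join() that blocks then does so in ordinary code, where it is easier to debug than inside a signal handler.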
