Python multiprocessing

This question is more about fact-finding and thought process than about code.

I have many compiled C++ programs that I need to run at different times and with different parameters. I'm looking at using Python multiprocessing to read a job from a job queue (rabbitmq) and then feed that job to a C++ program to run (maybe as a subprocess). I was looking at the multiprocessing module because this will all run on a dual-Xeon server, so I want to take full advantage of my server's multiprocessor capability.

The Python program would be the central manager and would simply read jobs from the queue, spawn a process (or subprocess?) with the appropriate C++ program to run the job, get the results (subprocess stdout & stderr), feed that to a callback and put the process back in a queue of processes waiting for the next job to run.

First, does this sound like a valid strategy?

Second, are there any examples of something similar to this?

Thank you in advance.

Comments (3)

衣神在巴黎 2024-11-23 13:09:16

The Python program would be the central manager and would simply read jobs from the queue, spawn a process (or subprocess?) with the appropriate C++ program to run the job, get the results (subprocess stdout & stderr), feed that to a callback and put the process back in a queue of processes waiting for the next job to run.

You don't need the multiprocessing module for this. The multiprocessing module is good for running Python functions as separate processes. To run a C++ program and read its results from stdout, you only need the subprocess module. The queue could be a list, and your Python program would simply loop while the list is non-empty, as in the sketch below.
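
A minimal sketch of that subprocess-only approach (the ./job binary and its arguments are hypothetical placeholders for your compiled programs):

import subprocess

# A plain list standing in for the job queue; each entry is
# (program, argument list). './job' is a hypothetical binary.
jobs = [('./job', ['--input', 'a.dat']),
        ('./job', ['--input', 'b.dat'])]

def callback(out, err):
    # Do something with one finished job's output
    print('job produced:', out.strip())

while jobs:
    program, args = jobs.pop(0)
    proc = subprocess.Popen([program] + args,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()            # wait for the job to finish
    callback(out.decode(), err.decode())     # hand the results to a callback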


However, if you want to

  1. spawn multiple worker processes,
  2. have them read from a common queue,
  3. use the arguments from the queue to spawn C++ programs (in parallel), and
  4. use the output of the C++ programs to put new items in the queue,

then you could do it with multiprocessing like this:

test.py:

import multiprocessing as mp
import subprocess
import sys

def worker(q):
    while True:
        # Get an argument from the queue
        x = q.get()

        # You might change this to run your C++ program
        proc = subprocess.Popen(
            [sys.executable, 'test2.py', str(x)], stdout=subprocess.PIPE)
        out, err = proc.communicate()
        out = out.decode().strip()    # stdout arrives as bytes in Python 3

        print('{name}: using argument {x} outputs {o}'.format(
            x=x, name=mp.current_process().name, o=out))

        q.task_done()

        # Put a new argument into the queue
        q.put(int(out))

def main():
    q = mp.JoinableQueue()

    # Put some initial values (1, 2 and 3) into the queue
    for t in range(1, 4):
        q.put(t)

    # Create and start a pool of worker processes
    for i in range(3):
        p = mp.Process(target=worker, args=(q,))
        p.daemon = True
        p.start()
    q.join()
    print("Finished!")

if __name__ == '__main__':
    main()

test2.py (a simple substitute for your C++ program):

import time
import sys

x=int(sys.argv[1])
time.sleep(0.5)
print(x+3)

Running test.py might yield something like this:

Process-1: using argument 1 outputs 4
Process-3: using argument 3 outputs 6
Process-2: using argument 2 outputs 5
Process-3: using argument 6 outputs 9
Process-1: using argument 4 outputs 7
Process-2: using argument 5 outputs 8
Process-3: using argument 9 outputs 12
Process-1: using argument 7 outputs 10
Process-2: using argument 8 outputs 11
Process-1: using argument 10 outputs 13

Notice that the numbers in the right-hand column are fed back into the queue and are (eventually) used as arguments to test2.py, showing up as the numbers in the left-hand column. Because every finished task enqueues a new argument, the queue never drains, so q.join() never returns and the example keeps running until you interrupt it.

梦途 2024-11-23 13:09:16

First, does this sound like a valid strategy?

Yes.

Second, are there any examples of something similar to this?

Celery
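
Celery is a distributed task queue for Python that speaks AMQP, so it can use the existing rabbitmq broker and take care of the worker management described in the question. As a rough sketch (the broker URL and the ./myprog path are placeholders), a Celery task that shells out to a compiled program might look like this:

from celery import Celery
import subprocess

app = Celery('jobs', broker='amqp://guest@localhost//', backend='rpc://')

@app.task
def run_cpp_job(program, args):
    # Run one compiled program and capture its output
    proc = subprocess.run([program] + args, capture_output=True, text=True)
    return proc.stdout, proc.stderr

Workers started with celery -A jobs worker would then pull calls such as run_cpp_job.delay('./myprog', ['--param', '1']) off the broker and run them in parallel across the server's cores.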

幻梦 2024-11-23 13:09:16

Sounds like a good strategy, but you don't need the multiprocessing module for it; you need the subprocess module. subprocess is for running child processes from a Python program and interacting with them (stdin, stdout, pipes, etc.), while multiprocessing is more about distributing Python code to run in multiple processes to gain performance through parallelism.

Depending on your responsiveness requirements, you may also want to look at threading and launch each subprocess from its own thread. That lets you block waiting on one subprocess while still being responsive on the queue, ready to accept other jobs, as in the sketch below.
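
A rough sketch of that threading approach (using /bin/echo as a stand-in for a compiled C++ program):

import queue
import subprocess
import threading

job_queue = queue.Queue()

def run_job(program, args):
    # The blocking wait happens in a worker thread, so the manager
    # thread stays free to pull the next job off the queue.
    proc = subprocess.Popen([program] + args,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    print('finished {p}: {o}'.format(p=program, o=out.decode().strip()))

def manager():
    while True:
        program, args = job_queue.get()   # blocks until a job arrives
        if program is None:               # sentinel: stop managing
            break
        threading.Thread(target=run_job, args=(program, args)).start()

m = threading.Thread(target=manager)
m.start()
job_queue.put(('/bin/echo', ['hello']))   # stand-in for a C++ program
job_queue.put((None, None))               # shut the manager down
m.join()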
