Best way to fork multiple shell commands/processes in Python?

Most of the examples I've seen with os.fork and the subprocess/multiprocessing modules show how to fork a new instance of the calling Python script or a chunk of Python code. What would be the best way to spawn a set of arbitrary shell commands concurrently?

I suppose I could just use subprocess.call or one of the Popen calls and pipe the output to a file, which I believe will return immediately, at least to the caller. I know this is not that hard to do; I'm just trying to figure out the simplest, most Pythonic way to do it.

Thanks in advance

5 Answers

驱逐舰岛风号 2024-12-21 14:03:06

All calls to subprocess.Popen return immediately to the caller. It's the calls to wait and communicate which block. So all you need to do is spin up a number of processes using subprocess.Popen (set stdin to /dev/null for safety), and then one by one call communicate until they're all complete.

Naturally I'm assuming you're just trying to start a bunch of unrelated (i.e. not piped together) commands.
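
A minimal sketch of that approach; the commands shown are placeholders, and subprocess.DEVNULL is the modern way to point stdin at /dev/null:

import subprocess

# hypothetical commands; substitute your own
commands = [['ls', '-l'], ['uname', '-a'], ['df', '-h']]

# Popen returns immediately, so all children start concurrently
procs = [subprocess.Popen(cmd,
                          stdin=subprocess.DEVNULL,   # nothing reads our terminal
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE)
         for cmd in commands]

# communicate() is the call that blocks; collect results one by one
for cmd, proc in zip(commands, procs):
    out, err = proc.communicate()
    print(cmd, 'exited with', proc.returncode)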

故事与诗 2024-12-21 14:03:06

I like to use PTYs instead of pipes. For a bunch of processes where I only want to capture error messages, I did this.

import pty
import sys
from subprocess import Popen

RNULL = open('/dev/null', 'r')           # shared stdin for every child
WNULL = open('/dev/null', 'w')           # children's stdout is discarded
logfile = open("myprocess.log", "a", 1)  # line-buffered shared log
REALSTDERR = sys.stderr
sys.stderr = logfile

This next part was in a loop spawning about 30 processes.

sys.stderr = REALSTDERR                  # restore the real stderr while forking
master, slave = pty.openpty()            # a fresh PTY pair for each child
self.subp = Popen(self.parsed, shell=False, stdin=RNULL, stdout=WNULL, stderr=slave)
sys.stderr = logfile

After this I had a select loop which collected any error messages and sent them to the single log file. Using PTYs meant that I never had to worry about partial lines getting mixed up because the line discipline provides simple framing.
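
Since the select loop itself isn't shown above, here is a rough sketch of what it could look like, assuming the master ends from pty.openpty() were collected in a list named masters and that the parent closed its copy of each slave fd after spawning (these names are assumptions, not from the answer):

import os
import select

def drain_to_log(masters, logfile):
    open_fds = list(masters)
    while open_fds:
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            try:
                data = os.read(fd, 4096)
            except OSError:              # Linux raises EIO once the slave end closes
                data = b''
            if not data:                 # EOF: this child is done
                open_fds.remove(fd)
                os.close(fd)
            else:
                # the PTY line discipline delivers whole lines, so no
                # partial-line interleaving between processes
                logfile.write(data.decode(errors='replace'))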

儭儭莪哋寶赑 2024-12-21 14:03:06

There is no single best way for all possible circumstances. The best choice depends on the problem at hand.

Here's how to spawn a process and save its output to a file, combining stdout and stderr:

import os
import subprocess
import sys

def spawn(cmd, output_file):
    # close_fds can't be combined with redirection on Windows in older Pythons
    on_posix = 'posix' in sys.builtin_module_names
    return subprocess.Popen(cmd, close_fds=on_posix, bufsize=-1,
                            stdin=open(os.devnull, 'rb'),   # no terminal input
                            stdout=output_file,
                            stderr=subprocess.STDOUT)       # merge stderr into stdout

To spawn multiple processes that can run in parallel with your script and each other:

processes, files = [], []
try:
    for i, cmd in enumerate(commands):
        files.append(open('out%d' % i, 'wb'))
        processes.append(spawn(cmd, files[-1]))
finally:
    for p in processes:   # wait for every child, even if spawning failed midway
        p.wait()
    for f in files:
        f.close()

Note: cmd is a list everywhere.
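
For illustration, commands could be any iterable of argument lists; these particular values are made up:

commands = [
    ['grep', '-r', 'TODO', '.'],
    ['du', '-sh', '/tmp'],
    ['sleep', '5'],
]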

冷夜 2024-12-21 14:03:06

I suppose I could just use subprocess.call or one of the Popen
calls and pipe the output to a file, which I believe will return
immediately, at least to the caller.

That's not a good way to do it if you want to process the data.

In this case, it is better to do

sp = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)

and then call sp.communicate(), or read directly via sp.stdout.read().

If the data shall be processed in the calling program at a later time, there are two ways to go (a sketch of way 1 follows below):

  1. You can retrieve the data as soon as possible, maybe via a separate thread, reading it and storing it somewhere the consumer can get it.

  2. You can let the producing subprocess block and retrieve the data from it when you need it. The subprocess produces as much data as fits in the pipe buffer (usually 64 KiB) and then blocks on further writes. As soon as you need the data, you read() from the subprocess object's stdout (maybe stderr as well) and use it - or, again, you use sp.communicate() at that later time.

Way 1 is the way to go if producing the data takes a long time, so that your program would otherwise have to wait.

Way 2 is preferable if the amount of data is quite large and/or the data is produced so fast that buffering it all would make no sense.
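
Here is a minimal sketch of way 1 under assumed names (start_reader, the ['ls', '-l'] command, and the sentinel are illustrative, not from the answer): a helper thread drains the pipe as soon as data appears, so the child never blocks, and the consumer pulls lines from a queue whenever convenient.

import queue
import subprocess
import threading

def start_reader(cmd):
    sp = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    lines = queue.Queue()

    def pump():
        for line in sp.stdout:   # runs concurrently, keeps the pipe drained
            lines.put(line)
        lines.put(None)          # sentinel: the producer has finished

    threading.Thread(target=pump, daemon=True).start()
    return sp, lines

sp, lines = start_reader(['ls', '-l'])
while True:
    line = lines.get()           # consumer fetches data when it's ready to
    if line is None:
        break
    print(line.decode().rstrip())
sp.wait()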

初熏 2024-12-21 14:03:06

See an older answer of mine, including code snippets, that does the following:

  • Uses processes, not threads, for blocking I/O, because they can be terminated more reliably via p.terminate()
  • Implements a retriggerable timeout watchdog that restarts counting whenever some output happens
  • Implements a long-term timeout watchdog to limit overall runtime
  • Can feed in stdin (although I only need to feed in one-time short strings)
  • Can capture stdout/stderr in the usual Popen way (only stdout is coded, with stderr redirected to stdout, but they can easily be separated)
  • It's almost realtime because it only checks every 0.2 seconds for output. But you could decrease this or remove the waiting interval easily
  • Lots of debugging printouts are still enabled to see what's happening when.

For spawning multiple concurrent commands, you would need to alter the class RunCmd to instantiate multiple read-output/write-input queues and to spawn multiple Popen subprocesses.
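
Since that older answer isn't reproduced here, the following is only a rough sketch of the two-watchdog idea it describes (every name is made up; the real RunCmd class differs): a reader process feeds output into a queue, an idle timeout re-arms whenever a line arrives, and a hard deadline caps the total runtime.

import subprocess
import time
from multiprocessing import Process, Queue
from queue import Empty

def _pump(cmd, q):
    sp = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)   # merge stderr into stdout
    for line in sp.stdout:
        q.put(line)
    q.put(None)                       # sentinel: the command finished

def run_with_watchdogs(cmd, idle_timeout=5.0, hard_timeout=60.0):
    q = Queue()
    reader = Process(target=_pump, args=(cmd, q), daemon=True)
    reader.start()
    deadline = time.monotonic() + hard_timeout
    output = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:            # long-term watchdog fired
            reader.terminate()
            break
        try:
            # q.get with a timeout acts as the retriggerable watchdog:
            # the clock restarts every time a line arrives
            line = q.get(timeout=min(idle_timeout, remaining))
        except Empty:                 # no output within idle_timeout
            reader.terminate()
            break
        if line is None:              # command exited normally
            break
        output.append(line)
    # note: terminating the reader does not kill the command itself;
    # a fuller version would share the child's PID and kill it too
    return b''.join(output)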
