Why do I have to use .wait() with Python's subprocess module?

Posted 2024-10-02 01:27:58


I'm running a Perl script through the subprocess module in Python on Linux. The function that runs the script is called several times with variable input.

import subprocess

def script_runner(variable_input):
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                               stdout=out_file, stderr=error_file)

However, if I run this function, say, twice, the execution of the first process will stop when the second process starts. I can get my desired behavior by adding

process.wait()

after calling the script, so I'm not really stuck. However, I want to find out why I cannot run the script using subprocess as many times as I want, and have the script make these computations in parallel, without having to wait for it to finish between each run.

UPDATE

The culprit was not so exciting: the Perl script used a common file that was rewritten for each execution.

However, the lesson I learned from this was that the garbage collector does not kill the process once it starts running, because this had no influence on my script once I got the file issue sorted out.
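For reference, here is a minimal sketch of the parallel pattern the question is after, assuming the Perl script can be told to use a per-run scratch file (the --workfile option and the input values are hypothetical) so that concurrent runs no longer clash over a shared file:

import subprocess

def script_runner(variable_input):
    # Per-run output files, so concurrent runs never write to the same file.
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')
    # Hypothetical option: point the Perl script at a per-run work file instead
    # of the common file that every execution used to rewrite.
    return subprocess.Popen(['perl', 'script', '--workfile', 'work_' + variable_input],
                            shell=False, stdout=out_file, stderr=error_file)

# Launch several runs, keep the handles, then wait for all of them at once.
processes = [script_runner(v) for v in ['a', 'b', 'c']]
for p in processes:
    p.wait()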

Comments (4)

水溶 2024-10-09 01:27:58


If you are using Unix, and wish to run many processes in the background, you could use subprocess.Popen this way:

x_fork_many.py:

import subprocess
import os
import sys
import time
import random
import gc  # Only here to test the hypothesis that garbage collection of p = Popen() is causing the problem.

# This spawns several (3) children in quick succession
# and then reports as each child finishes.
if __name__ == '__main__':
    N = 3
    if len(sys.argv) > 1:
        # Child: sleep for a random number of seconds, then exit.
        x = random.randint(1, 10)
        print('{p} sleeping for {x} sec'.format(p=os.getpid(), x=x))
        time.sleep(x)
    else:
        # Parent: launch N copies of this script as background children.
        for script in range(N):
            args = [sys.executable, sys.argv[0], 'sleep']
            p = subprocess.Popen(args)
        gc.collect()
        for i in range(N):
            pid, retval = os.wait()
            print('{p} finished'.format(p=pid))

The output looks something like this:

% x_fork_many.py 
15562 sleeping for 10 sec
15563 sleeping for 5 sec
15564 sleeping for 6 sec
15563 finished
15564 finished
15562 finished

I'm not sure why you are getting the strange behavior when not calling .wait(). However, the script above suggests (at least on unix) that saving subprocess.Popen(...) processes in a list or set is not necessary. Whatever the problem is, I don't think it has to do with garbage collection.

PS. Maybe your perl scripts are conflicting in some way, which causes one to end with an error when another one is running. Have you tried starting multiple calls to the perl script from the command line?
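A quick way to check for that kind of conflict from Python (a sketch, not the poster's actual setup; the script name and options are placeholders) is to start two runs at once and compare their exit codes:

import subprocess

# Start two runs of the same Perl script at the same time (placeholder arguments).
runs = [subprocess.Popen(['perl', 'script', 'options'],
                         stdout=open('out_%d' % i, 'wt'),
                         stderr=open('error_%d' % i, 'wt'))
        for i in range(2)]

# Wait for both and report their exit codes; a non-zero code from one run while
# the other is active would point to the scripts clashing over a shared resource.
for i, p in enumerate(runs):
    print('run %d exited with code %d' % (i, p.wait()))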

情痴 2024-10-09 01:27:58


You have to call wait() in order to explicitly wait for your Popen process to finish.

Since Popen runs the Perl script in the background, if you do not wait(), it will be stopped when the "process" object reaches the end of its life... that is, at the end of script_runner.

提笔书几行 2024-10-09 01:27:58


As said by ericdupo, the task is killed because you overwrite your process variable with a new Popen object, and since there are no more references to your previous Popen object, it is destroyed by the garbage collector. You can prevent this by keeping a reference to your objects somewhere, like a list:

import subprocess

processes = []

def script_runner(variable_input):
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                               stdout=out_file, stderr=error_file)
    processes.append(process)

This should be enough to prevent your previous Popen object from being destroyed.

心房的律动 2024-10-09 01:27:58


I think you want to do

import subprocess

list_process = []

def script_runner(variable_input):
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                               stdout=out_file, stderr=error_file)
    list_process.append(process)

# call script_runner several times, then wait for every process:
for process in list_process:
    process.wait()

so your processes will run in parallel
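One refinement not in the answer (a sketch under the same assumptions): the output files opened inside script_runner are never closed, so it can be worth tracking them alongside each process and closing them once the wait is over:

import subprocess

list_process = []

def script_runner(variable_input):
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')
    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                               stdout=out_file, stderr=error_file)
    # Keep the files together with the process so they can be closed later.
    list_process.append((process, out_file, error_file))

# call script_runner several times, then:
for process, out_file, error_file in list_process:
    process.wait()
    out_file.close()
    error_file.close()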
