How can Python continuously fill a subprocess's multiple threads?
I'm running an app, foo, on Linux. From a Bash script/terminal prompt, my application runs multi-threaded with this command:
$ foo -config x.ini -threads 4 < inputfile
System Monitor and top report that foo averages about 380% CPU load (quad-core machine). I've recreated this functionality in Python 2.6.x with:
proc = subprocess.Popen("foo -config x.ini -threads 4", \
shell=True, stdin=subprocess.PIPE, \
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
mylist = ['this','is','my','test','app','.']
for line in mylist:
txterr = ''
proc.stdin.write(line.strip()+'\n')
while not proc.poll() and not txterr.count('Finished'):
txterr += subproc.stderr.readline()
print proc.stdout.readline().strip(),
Foo runs slower, and top reports a CPU load of 100%. Foo also runs fine with shell=False, but is still slow:
proc = subprocess.Popen("foo -config x.ini -threads 4".split(), \
shell=False, stdin=subprocess.PIPE, \
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
Is there a way to have Python subprocess continuously fill all the threads?
3 Answers
When you call a command with Popen like this, it doesn't matter whether it's invoked from Python or from the shell. It's the foo command that starts its own processes, not Python.
So the answer is: yes, subprocesses can be multi-threaded when called from Python.
First things first: are you guessing it is single-threaded only because it is using 100% of CPU rather than 400%?
It would be better to check how many threads it has actually started. Use the top program and hit the H key to show threads, or use ps -eLf and make sure the NLWP column shows multiple threads.
Linux can be pretty twitchy with CPU affinity; by default, the scheduler will NOT move a process away from the last processor it used. Which means that if all four threads of your program were started on a single processor, they will ALL share that processor forever. You must use a tool like taskset(1) to force CPU affinity on processes that must run on separate processors for a long time, e.g. taskset -cp 0 <pid1> ; taskset -cp 1 <pid2> ; taskset -cp 2 <pid3> ; taskset -cp 3 <pid4>. You can retrieve the current affinity with taskset -p <pid> to find out what it is set to.
(One day I wondered why my Folding At Home processes were using much less CPU time than I expected. I found that the bloody scheduler had placed three FaH tasks on ONE HyperThread sibling and the fourth FaH task on the other HT sibling of the same core, while the other three processors sat idle. The first core also ran quite hot, and the other three cores were four or five degrees colder. Heh.)
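If you'd rather run those checks from the Python side, here is a minimal sketch (my own illustration, not part of the original answer) that shells out to the same ps and taskset tools; the helper names thread_count and pin_to_cpus are made up:
import subprocess

def thread_count(pid):
    # NLWP = number of light-weight processes, i.e. kernel threads.
    # 'ps -o nlwp= -p PID' prints just that column with no header.
    out = subprocess.Popen(["ps", "-o", "nlwp=", "-p", str(pid)],
                           stdout=subprocess.PIPE).communicate()[0]
    return int(out.strip())

def pin_to_cpus(pid, cpu_list):
    # -a applies the affinity to all of the process's threads, not
    # just the main one; e.g. pin_to_cpus(proc.pid, "0-3").
    subprocess.call(["taskset", "-acp", cpu_list, str(pid)])
With proc being the Popen object from the question, thread_count(proc.pid) should report more than one thread if foo really did start its workers.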
If your Python script doesn't feed the foo process fast enough, then you could offload reading stdout and stderr to threads:
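For example, a minimal sketch assuming the same foo command as in the question (the read_output helper name is my own): background threads drain stdout and stderr so a full pipe buffer never stalls foo, and the main thread does nothing but write input.
import subprocess
from threading import Thread
from Queue import Queue  # named 'queue' in Python 3

def read_output(pipe, queue):
    # Drain a pipe line by line in a background thread so the child
    # never blocks on a full pipe buffer.
    for line in iter(pipe.readline, ''):
        queue.put(line)
    pipe.close()

proc = subprocess.Popen("foo -config x.ini -threads 4".split(),
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

q = Queue()
readers = [Thread(target=read_output, args=(pipe, q))
           for pipe in (proc.stdout, proc.stderr)]
for t in readers:
    t.start()

# Write all the input up front instead of waiting for 'Finished'
# after each line, so foo always has work queued for its threads.
for line in ['this', 'is', 'my', 'test', 'app', '.']:
    proc.stdin.write(line.strip() + '\n')
proc.stdin.close()

for t in readers:
    t.join()  # readers exit at EOF, once foo closes its ends
proc.wait()

while not q.empty():
    print q.get().strip(),
The key difference from the loop in the question is that writing and reading are decoupled: the original code writes one line and then blocks until it sees 'Finished', which serializes the work and leaves three of foo's four threads idle.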