I have a simple bash script that pipes the output of one process to another. Namely:

dostuff | filterstuff

It happens that on my Linux system (openSUSE, if it matters, kernel 2.6.27) both of these processes run on a single core. However, running different processes on different cores is the default policy; it just doesn't happen to trigger in this case.

What component of the system is responsible for that, and what should I do to utilize the multicore capability?

Note that there's no such problem on the 2.6.30 kernel.

Clarification: Having followed Dennis Williamson's advice, I made sure with the top program that the piped processes are indeed always run on the same processor. The Linux scheduler, which usually does a really good job, doesn't do it this time.

I figure that something in bash prevents the OS from doing it. The thing is that I need a portable solution that works on both multi-core and single-core machines. The taskset solution proposed by Dennis Williamson won't work on single-core machines. Currently I'm using:

dostuff | taskset -c 0 filterstuff

but this seems like a dirty hack. Could anyone provide a better solution?
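For reference, a guarded variant of the same hack (the core-count check via /proc/cpuinfo is just a sketch, not a solution I'm happy with):

# Only pin when more than one core is present; otherwise fall back to a plain pipe.
# (Counting cores via /proc/cpuinfo is an assumption, not part of the original hack.)
if [ "$(grep -c '^processor' /proc/cpuinfo)" -gt 1 ]; then
    dostuff | taskset -c 0 filterstuff
else
    dostuff | filterstuff
fi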
Suppose dostuff is running on one CPU. It writes data into a pipe, and that data will be in cache on that CPU. Because filterstuff is reading from that pipe, the scheduler decides to run it on the same CPU, so that its input data is already in cache.

If your kernel is built with CONFIG_SCHED_DEBUG=y, turning off the corresponding scheduler feature flag should disable this class of heuristics. (See /usr/src/linux/kernel/sched_features.h and /proc/sys/kernel/sched_* for other scheduler tunables.)

If that helps, and the problem still happens with a newer kernel, and it's really faster to run on separate CPUs than on one CPU, please report the problem to the Linux Kernel Mailing List so that they can adjust their heuristics.
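A minimal sketch of what toggling such a flag can look like, assuming a 2.6.2x-era tree whose kernel/sched_features.h lists a SYNC_WAKEUPS (or AFFINE_WAKEUPS) flag; the flag name and the debugfs path are assumptions, so check your own kernel first:

# Requires root and a kernel built with CONFIG_SCHED_DEBUG=y.
mountpoint -q /sys/kernel/debug || mount -t debugfs none /sys/kernel/debug

cat /sys/kernel/debug/sched_features                      # list currently enabled feature flags
echo NO_SYNC_WAKEUPS > /sys/kernel/debug/sched_features   # assumed flag name: disable the wakeup heuristic
# ... rerun "dostuff | filterstuff" and compare ...
echo SYNC_WAKEUPS > /sys/kernel/debug/sched_features      # restore the default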
Give this a try to set the CPU (processor) affinity:
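A sketch of the kind of taskset invocation meant here; the specific core numbers 0 and 1 are illustrative assumptions, not necessarily what was originally suggested:

# Pin each side of the pipeline to its own core (core IDs chosen arbitrarily).
taskset -c 0 dostuff | taskset -c 1 filterstuff

As the question notes, this isn't portable to single-core machines.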
Edit:

Try this experiment:

Create a file called proctest, chmod +x proctest, with this as the contents:

#!/bin/bash
while true
do
    ps
    sleep 2
done

Start this running:

./proctest | grep bash

In another terminal, run top and quit it so its last screen stays visible, then run:

ps u

Start top -p with a list of the PIDs of the highest several processes, say 8 of them, from the list left on-screen by the exited top, plus the ones for proctest and grep which were listed by ps - all separated by commas, like so (the order doesn't matter):

top -p 1234,1255,1211,1212,1270,1275,1261,1250,16521,16522

In top, press s, type .09, and press Enter to set a short delay time, then watch: proctest and grep bounce around, sometimes on the same processor, sometimes on different ones.
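For a quick spot check without reconfiguring top, ps (procps on Linux) can also print the processor a task last ran on, in the psr column; the bracketed patterns keep the grep from matching itself:

# Run this repeatedly (or under watch) and look at the PSR column change.
ps -eo pid,psr,args | grep -E '[p]roctest|[g]rep bash'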
The Linux scheduler is designed to give maximum throughput, not to do what you imagine is best. If you're running processes which are connected with a pipe, in all likelihood one of them is blocking the other, and then they swap over. Running them on separate cores would achieve little or nothing, so it doesn't.

If you have two tasks which are both genuinely ready to run on the CPU, I'd expect to see them scheduled on different cores (at some point).

My guess is that what happens is this: dostuff runs until the pipe buffer becomes full, at which point it can't run any more, so the "filterstuff" process runs; but it runs for such a short time that dostuff doesn't get rescheduled until filterstuff has finished filtering the entire pipe buffer, at which point dostuff gets scheduled again.
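A quick way to see that blocking pattern; yes and the sleep interval below are arbitrary stand-ins for a fast producer and a slow consumer:

# The producer fills the pipe buffer (64 KiB by default on Linux), then blocks
# in write() until the slow consumer drains it, so the two sides mostly take
# turns rather than running in parallel. Watch the producer and the reading
# shell in top while this runs.
yes "some data" | while read -r line; do sleep 0.01; done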