Using make to run independent tasks in parallel
I have a bunch of commands I would like to execute in parallel. The commands are nearly identical. They can be expected to take about the same time, and can run completely independently. They may look like:
command -n 1 > log.1
command -n 2 > log.2
command -n 3 > log.3
...
command -n 4096 > log.4096
I could launch all of them in parallel in a shell script, but the system would try to load more than strictly necessary to keep the CPU(s) busy (each task takes 100% of one core until it has finished). This would cause the disk to thrash and make the whole thing slower than a less greedy approach to execution.
The best approach is probably to keep about n tasks executing, where n is the number of available cores.
I am keen not to reinvent the wheel. This problem has already been solved in the Unix make program (when used with the -j n option). I was wondering if perhaps it was possible to write generic Makefile rules for the above, so as to avoid the linear-size Makefile that would look like:
all: log.1 log.2 ...
log.1:
	command -n 1 > log.1
log.2:
	command -n 2 > log.2
...
If the best solution is not to use make but another program/utility, I am open to that as long as the dependencies are reasonable (make was very good in this regard).
Answers (5)
Here is more portable shell code that does not depend on brace expansion:
Note the use of := to define a more efficient variable: the simply expanded "flavor".
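As a minimal sketch of what brace-free list generation could look like (not necessarily this answer's own code), the function below builds the target names with a plain POSIX loop. The name log_names is hypothetical and only used for illustration.

```shell
#!/bin/sh
# log_names COUNT
# Print "log.1 log.2 ... log.COUNT" on one line.
# Uses only POSIX sh features: no {1..N} brace expansion and no seq.
log_names() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf 'log.%s ' "$i"
        i=$((i + 1))
    done
}
```

Inside a Makefile the same loop could be placed in a $(shell ...) call, with every shell $ doubled to $$. Assigning the result with := means the shell runs once when the Makefile is read, rather than each time the variable is referenced.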
See pattern rules.
Another way, if this is the single reason why you need make, is to use the -n and -P options of xargs.
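A minimal sketch of the xargs approach, under a few assumptions: the installed xargs supports -P (GNU and BSD xargs do, but it is not required by POSIX), seq is available, and the task is an executable passed in by path. The function name run_with_xargs is hypothetical. Note that `command` in the question is a placeholder: in a real shell, `command` is a builtin, so the task needs a real name.

```shell
#!/bin/sh
# run_with_xargs TASK COUNT JOBS
# Runs "TASK -n i > log.i" for i = 1..COUNT, with at most JOBS running at once.
#   -n 1  hands one number to each invocation
#   -P    caps the number of simultaneous processes
# sh -c is needed because the output redirection must be done by a shell;
# inside it, "$0" is TASK and "$1" is the number supplied by xargs.
run_with_xargs() {
    task=$1 count=$2 jobs=$3
    seq 1 "$count" |
        xargs -n 1 -P "$jobs" sh -c '"$0" -n "$1" > "log.$1"' "$task"
}
```

For the question's workload this would be called as run_with_xargs ./mytask 4096 8, where ./mytask and 8 are placeholders for the real program and core count.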
First the easy part. As Roman Cheplyaka points out, pattern rules are very useful:
The tricky part is creating that list, LOGS. Make isn't very good at handling numbers. The best way is probably to call on the shell. (You may have to adjust this script for your shell; shell scripting isn't my strongest subject.)
xargs -P is the "standard" way to do this.
Note that, depending on disk I/O, you may want to limit the job count to the number of disk spindles rather than the number of cores.
If you do want to limit to cores, note the new nproc command in recent coreutils.
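A small sketch of picking the job count from the core count. nproc comes from the answer above; the getconf fallback for systems without it is an added assumption, and default_jobs is a hypothetical name.

```shell
#!/bin/sh
# default_jobs
# Print a job count suitable for "xargs -P" or "make -j".
# Prefers nproc (recent coreutils). The getconf fallback is an assumption for
# systems that lack nproc, and 1 is the last resort.
default_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    fi
}
```

The result can be passed straight through, for example xargs -P "$(default_jobs)". If the tasks are disk-bound, a smaller fixed number may work better than the core count.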
With GNU Parallel you would write:
10 second installation:
Learn more:
http://www.gnu.org/software/parallel/parallel_tutorial.html
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
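A minimal sketch of what the GNU Parallel invocation could look like for the question's workload, not necessarily the exact command from this answer. It assumes GNU Parallel (not the unrelated parallel from moreutils) and seq are installed; run_with_parallel is a hypothetical wrapper name.

```shell
#!/bin/sh
# run_with_parallel TASK COUNT
# Runs "TASK -n i > log.i" for i = 1..COUNT under GNU Parallel.
# {} is replaced by each input line. By default GNU Parallel runs one job per
# CPU core, so no explicit job count is given here (-j N would override it).
# The command string is run through a shell, so the redirection works, but it
# also means TASK must not contain characters that need quoting.
run_with_parallel() {
    task=$1 count=$2
    seq 1 "$count" | parallel "$task -n {} > log.{}"
}
```

With a real program in place of the question's placeholder, this would be called as run_with_parallel ./mytask 4096.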