How do I launch processes simultaneously using the Platform LSF blaunch command?

Posted on 2024-12-12 18:13:51

I'm having a hard time figuring out why I can't launch commands in parallel using the LSF blaunch command:

for num in `seq 3`; do
blaunch -u JobHost ./cmd_${num}.sh &
done

Error message:

Oct 29 13:08:55 2011 18887 3 7.04 lsb_launch(): Failed while executing tasks.
Oct 29 13:08:55 2011 18885 3 7.04 lsb_launch(): Failed while executing tasks.
Oct 29 13:08:55 2011 18884 3 7.04 lsb_launch(): Failed while executing tasks.

Removing the ampersand (&) allows the commands to execute sequentially, but I am after parallel execution.

Comments (3)

权谋诡计 2024-12-19 18:13:51

When executed within the context of bsub, a single invocation of blaunch -u <hostfile> <cmd> will take <cmd> and run it on all the hosts specified in <hostfile> in parallel as long as those hosts are within the job's allocation.

What you're trying to do is use 3 separate invocations of blaunch to run 3 separate commands. I can't find this in the documentation, but some testing on a recent version of LSF shows that each task executed in such a job has a unique task ID, stored in an environment variable called LSF_PM_TASKID. You can verify this in your version of LSF by running something like:

bsub -I -n <num_tasks> blaunch env | grep TASKID

Now, what does this have to do with your question? You want to run ./cmd_$i.sh for i=1,2,3 in parallel through blaunch. To do this you can write a single script which I'll call cmd.sh as follows:

#!/bin/sh
./cmd_${LSF_PM_TASKID}.sh

Now you can replace your for loop with a single invocation of blaunch like so:

blaunch -u JobHost cmd.sh

This will run one instance of cmd.sh in parallel on each host listed in the file 'JobHost'. Each instance will run the shell script cmd_X.sh, where X is the value of $LSF_PM_TASKID for that particular task.

If there are exactly 3 hostnames in 'JobHost', you will get 3 instances of cmd.sh, which will in turn lead to one instance each of cmd_1.sh, cmd_2.sh, and cmd_3.sh.
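Before submitting the job, it can help to dry-run the wrapper locally. The sketch below is not part of LSF: it sets LSF_PM_TASKID by hand to simulate three tasks, on the assumption (suggested by the testing above, not by the documentation) that the IDs are 1, 2 and 3. The cmd_N.sh scripts here are hypothetical stand-ins that only print a line, and the guard for an unset LSF_PM_TASKID is an addition of mine so the wrapper fails loudly outside a blaunch task.

```shell
#!/bin/sh
# Local dry run of the cmd.sh dispatch wrapper; no LSF required.
# Assumption: inside a real job LSF sets LSF_PM_TASKID to 1, 2, 3.
# Here the variable is set by hand to imitate that.

demo_dir=$(mktemp -d)

# Hypothetical stand-ins for the real cmd_1.sh .. cmd_3.sh.
for i in 1 2 3; do
    printf '#!/bin/sh\necho "cmd_%s ran"\n' "$i" > "$demo_dir/cmd_$i.sh"
done

# The wrapper, with a guard for a missing task ID.
cat > "$demo_dir/cmd.sh" <<'EOF'
#!/bin/sh
if [ -z "${LSF_PM_TASKID:-}" ]; then
    echo "LSF_PM_TASKID is not set" >&2
    exit 1
fi
# Run the per-task script that sits next to this wrapper.
exec sh "$(dirname "$0")/cmd_${LSF_PM_TASKID}.sh"
EOF

# Simulate the three tasks in parallel, then wait for all of them.
for i in 1 2 3; do
    LSF_PM_TASKID=$i sh "$demo_dir/cmd.sh" > "$demo_dir/out_$i.txt" &
done
wait

# Collect the per-task output in task order.
dry_run_output=$(cat "$demo_dir/out_1.txt" "$demo_dir/out_2.txt" "$demo_dir/out_3.txt")
echo "$dry_run_output"
```

Each simulated task should print the line from its own cmd_N.sh. If the wrapper behaves here but not under blaunch, check what the `env | grep TASKID` test above reports on your LSF version.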

爱本泡沫多脆弱 2024-12-19 18:13:51

Have you tried nohup? This might work:

for num in `seq 3`; do
nohup blaunch -u JobHost ./cmd_${num}.sh &>/dev/null &
done

沒落の蓅哖 2024-12-19 18:13:51

blaunch is not to be used outside of the job execution environment provided by bsub. I don't know how to handle running different commands for each process, but try something like:

bsub -n 3 blaunch ./cmd.sh