Configure Slurm to run more MPI tasks than cores

Posted 2025-01-09 17:47:42

I'm setting up Slurm on a cluster of Raspberry Pi 4s. I have succeeded in configuring and using Slurm on my 24-RPi cluster, allowing 4 MPI tasks per RPi. So, if I make an MPI run (using either "srun" or "sbatch" with a batch script) across all the nodes (-N 24) and all the cores (-n 96), it works as I would expect.
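
For example, a full-cluster launch with "srun" (where "./my_mpi_prog" is just a placeholder for the actual MPI binary) is simply:

srun -N 24 -n 96 ./my_mpi_prog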

An example of a typical "slurm.conf" line for an RPi node is:

NodeName=foo-001 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN

There are 24 node entries of this form. The overall partition entry is:

PartitionName=general Nodes=foo-[001-024] Default=YES MaxTime=INFINITE State=UP

I would like to run more than 4 MPI tasks per RPi (to test overloading the cores). I haven't been able to set up the Slurm configuration to do this.

I don't want multiple threads per MPI task. I want to be able to set up Slurm so that it allows 8 or 16 MPI tasks per node, for a total of 192 or 384 MPI tasks respectively.

Here is my overall "slurm.conf" file:

# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
SlurmctldHost=foo-001
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/cgroup
ReturnToService=1
SlurmctldPidFile=/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=tjl-pi-pharm
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
#SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=info
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# ARRAY LIMITS
MaxArraySize=100000
MaxJobCount=1000000
#
#
# COMPUTE NODES
NodeName=foo-001 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-002 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-003 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-004 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-005 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-006 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-007 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-008 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-009 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-010 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-011 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-012 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-013 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-014 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-015 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-016 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-017 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-018 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-019 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-020 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-021 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-022 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-023 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=foo-024 CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
PartitionName=general Nodes=foo-[001-024] Default=YES MaxTime=INFINITE State=UP
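
One configuration-side idea I have looked at, but not yet verified on this cluster, is to advertise more logical CPUs per node than the Pi actually has, together with FastSchedule=2 so that slurmd does not drain the nodes over the hardware mismatch. For 8 tasks per node that would look something like:

FastSchedule=2
NodeName=foo-001 CPUs=8 Sockets=1 CoresPerSocket=4 ThreadsPerCore=2 State=UNKNOWN

I don't know yet how well task/affinity binding behaves with this on hardware that has no SMT, so I'm treating it only as a sketch.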

I'm adding some additional information to clarify what I'm trying to achieve.

For my current work, I'm using the Slurm "array" capability. For example, the start of my batch script (submitted using "sbatch") is:

#!/bin/bash
#SBATCH -J bb
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 1
#SBATCH -a 0-9970%96
#SBATCH -t 4:00:00
#SBATCH -o enum-%A_%a.txt
...

This works and causes the "..." in the script to be executed independently on all 96 cores (this is an embarrassingly parallel set of computations). I'd like to replace "%96" with "%192" and have 192 jobs running on 96 cores simultaneously. This doesn't currently happen with my Slurm configuration. Instead, Slurm still runs only 96 jobs at a time and fills in as these complete. I have tried the "-O"/"--overcommit" flag to "sbatch", but it doesn't seem to change the behavior.
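
For this array case specifically, my understanding (not yet confirmed on this cluster) is that sharing cores between jobs is controlled at the partition level, so a change along these lines, combined with "%192", may be what is needed to allow two single-core jobs per core:

PartitionName=general Nodes=foo-[001-024] Default=YES MaxTime=INFINITE State=UP OverSubscribe=FORCE:2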

Answer from 十级心震, posted 2025-01-16 17:47:42:

Instead of changing your Slurm configuration, you could instruct Slurm to allocate multiple tasks per core. In the allocation (i.e. the jobscript) you can add --ntasks-per-core=4 and start the MPI program with the srun parameter --overcommit.

Example jobscript:

#!/bin/bash
[...]
#SBATCH -N 24
#SBATCH --ntasks-per-core=4
#SBATCH -n 96
[...]

srun --overcommit -n 384 ./your-prog
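
As a quick sanity check (hostname is used here only as a stand-in for the real program), running the following inside such an allocation should report roughly 16 tasks landing on each of the 24 nodes:

srun --overcommit -n 384 hostname | sort | uniq -c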