Run multiple files consecutively via SLURM, with a separate timeout for each
I have a python script I run on HPC that takes a list of files in a text file and starts multiple SBATCH runs:
./launch_job.sh 0_folder_file_list.txt
launch_job.sh goes through 0_folder_file_list.txt and starts an SBATCH for each file
SAMPLE_LIST=`cut -d "." -f 1 $1`
for SAMPLE in $SAMPLE_LIST
do
echo "Getting accessions from $SAMPLE"
sbatch get_acc.slurm $SAMPLE
#./get_job.slurm $SAMPLE
done
get_job.slurm has all of my SBATCH information, module loads, etc. and performs
srun --mpi=pmi2 -n 5 python python_script.py ${SAMPLE}.txt
I don't want to start all of the jobs at once; I would like them to run consecutively, each with a 24-hour maximum run time. I have already set SBATCH -t to allow for the maximum time, but I only want each individual job to run for a maximum of 24 hours. Is there an srun argument I can set that will accomplish this? Or something else?
You can use the --wait flag with sbatch. In your case:
sbatch --wait get_acc.slurm $SAMPLE
So, the next sbatch command will only be called after the first sbatch finishes (your job ended, or its time limit was reached).
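Putting this together, the loop in launch_job.sh might look like the sketch below. It assumes the 24-hour cap is set inside the submission script via "#SBATCH -t 24:00:00" (or an equivalent --time value), so that --wait returns either when the job completes or when SLURM kills it at the limit:

```shell
#!/bin/bash
# launch_job.sh -- submit one job per sample, blocking until each job
# finishes before submitting the next.
# Assumes get_acc.slurm contains:  #SBATCH -t 24:00:00
# so every job is killed after at most 24 hours.

SAMPLE_LIST=$(cut -d "." -f 1 "$1")

for SAMPLE in $SAMPLE_LIST
do
    echo "Getting accessions from $SAMPLE"
    # --wait makes sbatch block until the job terminates
    # (normal exit, failure, or time-limit kill), so the jobs
    # run one after another instead of all at once.
    sbatch --wait get_acc.slurm "$SAMPLE"
done
```

Since --wait makes sbatch itself block, launch_job.sh now runs for as long as the whole sequence of jobs, so it is worth running it inside screen/tmux or submitting it as a job of its own.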