Slurm job array not working when used with argparse
I am trying to run multiple things at once (i.e. in a parallel manner) with different values of the variable --start_num. I have designed the following bash script:
#!/bin/bash
#SBATCH --job-name fmriGLM # to set a different job name
#SBATCH --nodes=1
#SBATCH -t 16:00:00 # Time for running job
#SBATCH -o /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/output_fmri_glm.o%j # %j is replaced by the job id
#SBATCH -e /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/error_fmri_glm.e%j
pwd; hostname; date
#SBATCH --ntasks=30
#SBATCH --mem-per-cpu=3000MB
#SBATCH --cpus-per-task=5
#SBATCH -a 0-5
python FINAL_ARGPARSE_RUN.py --n_division 30 --start_num $SLURM_ARRAY_TASK_ID
Then, I ran sbatch --exclude master array_bash_2, but it doesn't work. I have tried searching many sites and have tried multiple things, but the error FINAL_ARGPARSE_RUN.py: error: argument --start_num: expected one argument still pops up in the error file, making me feel that the $SLURM_ARRAY_TASK_ID in the bash script hasn't been expanded properly...?
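For reference, a minimal sketch (plain sh, outside Slurm, not the actual job) of what happens to the command line when SLURM_ARRAY_TASK_ID is unset:

```shell
#!/bin/sh
# Quick check: when SLURM_ARRAY_TASK_ID is unset, the unquoted expansion
# vanishes during word splitting, leaving --start_num as the final word
# with no value for argparse to consume.
unset SLURM_ARRAY_TASK_ID
set -- --n_division 30 --start_num $SLURM_ARRAY_TASK_ID
echo "argument count: $#"   # 3, not the expected 4
echo "last argument: $3"    # --start_num
```

This matches the "expected one argument" message: the option name is present but its value never makes it onto the command line.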
Could anyone explain why this is and how I can fix it?
Thank you!
1 Answer
The problem seems to be in your line pwd; hostname; date. Don't put any non-comment lines in between #SBATCH directives: Slurm stops processing directives at the first ordinary shell line, so every #SBATCH line after it (including the array specification -a 0-5) is ignored. That means you are not submitting an array job, just a single job, and SLURM_ARRAY_TASK_ID is never set. Move that line after the last #SBATCH line and it should work now.
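Concretely, a corrected version of your script might look like this (same directives, with the shell commands moved below the last #SBATCH line):

```shell
#!/bin/bash
# Sketch of the corrected script: every #SBATCH directive sits in one
# contiguous comment block at the top, before any executable command,
# so Slurm reads all of them, including the array specification.
#SBATCH --job-name fmriGLM        # to set a different job name
#SBATCH --nodes=1
#SBATCH -t 16:00:00               # time limit for the job
#SBATCH -o /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/output_fmri_glm.o%j   # %j is replaced by the job id
#SBATCH -e /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/error_fmri_glm.e%j
#SBATCH --ntasks=30
#SBATCH --mem-per-cpu=3000MB
#SBATCH --cpus-per-task=5
#SBATCH -a 0-5

pwd; hostname; date

python FINAL_ARGPARSE_RUN.py --n_division 30 --start_num $SLURM_ARRAY_TASK_ID
```

Now each of the six array tasks gets SLURM_ARRAY_TASK_ID set to a value from 0 to 5, and --start_num receives a proper argument.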