Using qsub (SGE) with multi-threaded applications
I want to submit a multi-threaded job to the cluster network I'm working with, but the qsub man page is not clear on how this is done. By default I guess it just sends it as a normal job regardless of the multi-threading, but this might cause problems, i.e. many multi-threaded jobs being sent to the same computer, slowing things down.
Does anyone know how to accomplish this? Thanks.
The batch server system is SGE.
3 Answers
In SGE/UGE the configuration is set by the administrator, so you have to check what they've called the parallel environments.
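For example, you can list the parallel environments the administrator has defined with qconf (a quick sketch; the names it returns will differ on your cluster):

    qconf -spl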
Look for one with $pe_slots in the config.
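To inspect a particular environment's configuration, including its allocation rule, you can show it by name (again a sketch; "smp" is a placeholder, substitute a name from the list above):

    qconf -sp smp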
Then qsub with that environment and the number of cores you want to use.
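A minimal submission along those lines (the PE name "smp", the slot count and the script name are placeholders):

    qsub -pe smp 8 -cwd ./myscript.sh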
If you're using MPI you have more choices for the config allocation rule ($pe_slots above), like $round_robin and $fill_up, but this should get you going.
If your job is multithreaded, you can harness the advantage of multithreading even in SGE. In SGE a single job can use one or many CPUs. If you submit a job that uses a single processor, and your program creates more threads than a single processor can handle, problems occur. Verify how many processors your job is using and how many threads per CPU your program is creating.
In my case I have a Java program that uses one processor with two threads, and it works pretty efficiently. I submit the same Java program for execution to many CPUs, with 2 threads each, to make it parallel, since I do not use MPI.
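A rough sketch of that kind of submit script, assuming the cluster defines a $pe_slots parallel environment (the PE name "smp" and the class name are placeholders), requesting as many slots as the program has threads:

    #!/bin/bash
    # PE name "smp" is a placeholder; use whatever $pe_slots environment your cluster defines
    #$ -S /bin/bash
    #$ -cwd
    #$ -pe smp 2
    java MyTwoThreadedProgram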
The answer by the user "j_m" is very helpful, but in my case I needed to both request multiple cores AND submit my job to a specific node. After a copious amount of searching, I finally found a solution that worked for me and I'm posting it here so that other people who might have a similar problem don't have to go through the same pain (please note that I'm leaving this as an answer instead of a reply because I don't have enough reputation for making replies):
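A sketch of the kind of command that works here, assuming the default all.q queue (-q pins the job to a specific node, -pe requests the cores; the variable names are the ones discussed below):

    qsub -q all.q@$NODE_NAME -pe $ENV_NAME $N_OF_CORES $SCRIPT_NAME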
I think the variables $NODE_NAME, $N_OF_CORES and $SCRIPT_NAME are pretty straightforward. You can easily find $ENV_NAME by following the answer by "j_m".