Problems suspending and resuming jobs

Posted on 2024-11-07 01:47:00

I have a driver script which manages a job string which can run jobs in parallel or sequentially based on a dependency graph. For example:

Job              Predecessors

A                null
B                A
C                A
D                B
E                D, C
F                E
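
As a rough illustration only, the table above could be encoded in shell as follows. The names `preds_of`, `is_ready`, and `finished` are hypothetical and do not come from the driver script; the sketch only checks whether a job's predecessors are all done.

```shell
#!/usr/bin/env bash
# Hypothetical encoding of the dependency table: print the predecessors of job $1.
preds_of() {
  case $1 in
    A)   echo "" ;;
    B|C) echo "A" ;;
    D)   echo "B" ;;
    E)   echo "D C" ;;
    F)   echo "E" ;;
  esac
}

# Space-separated list of jobs that have completed so far (assumed state).
finished="A B"

# Succeeds when every predecessor of job $1 appears in $finished.
is_ready() {
  for p in $(preds_of "$1"); do
    case " $finished " in
      *" $p "*) ;;        # this predecessor is done, keep checking
      *) return 1 ;;      # at least one predecessor is still pending
    esac
  done
  return 0
}

for job in C D E F; do
  if is_ready "$job"; then
    echo "$job: predecessors done"
  else
    echo "$job: waiting"
  fi
done
```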

The driver starts A in the background and waits for it to complete by suspending itself using the bash built-in suspend. On completion, job A sends a SIGCONT to the driver, which then starts B and C in the background and suspends itself again, and so on.

The driver has a set -m so job control is enabled.

This works fine when the driver itself is started in the background. However, when the driver is invoked in the foreground, only the first call to suspend works as intended. The second call seems to turn into an 'exit', which reports "There are stopped jobs" and does not exit. The third call to suspend also turns into an 'exit' and kills the driver and all its children [as it should, considering this is the second call converted to 'exit'].

And this is my question: Is this expected behavior? If so, why and how do I work around it?

Thanks.

Code fragments below:

Driver:

            for step in $(hash_keys 'RUNNING_HASH')
            do
                    proc=$(hash_find 'RUNNING_HASH' $step)
                    if [ $proc ]
                    then
                            # added the grep to ensure the process is found
                            ps -p $proc | grep $proc > /dev/null 2>&1
                            if [ $? -eq 0 ]
                            then
                                    log_msg_to_stderr $SEV_DEBUG "proc $proc running: suspending execution"
                                    suspend 
                                    # execution resumes here on receipt of SIGCONT
                                    log_msg_to_stderr $SEV_DEBUG "signal received: continuing execution"
                                    break
                            fi
                    fi
            done

Job:

## $$ is the driver's PID
kill -SIGCONT $$

Comments (2)

一江春梦 2024-11-14 01:47:00

I have to think you are over-complicating things playing with job control and suspend, etc. Here is an example program which keeps 5 children running at all times. Once a second it looks to see if anyone went away (much more efficiently than ps|grep, BTW) and starts up a new child if necessary.

#!/usr/bin/bash

set -o monitor
trap "pkill -P $$ -f 'sleep 10\.9' >&/dev/null" SIGCHLD

totaljobs=15
numjobs=5
worktime=10
curjobs=0
declare -A pidlist

dojob()
{
  slot=$1
  time=$(echo "$RANDOM * 10 / 32768" | bc -l)
  echo Starting job $slot with args $time
  sleep $time &
  pidlist[$slot]=`jobs -p %%`
  curjobs=$(($curjobs + 1))
  totaljobs=$(($totaljobs - 1))
}

# start
while [ $curjobs -lt $numjobs -a $totaljobs -gt 0 ]
 do
  dojob $curjobs
 done

# Poll for jobs to die, restarting while we have them
while [ $totaljobs -gt 0 ]
 do
  # Scan only the numjobs slots; curjobs keeps growing as jobs are replaced
  for ((i=0; i < numjobs; i++))
   do
    if ! kill -0 ${pidlist[$i]} >&/dev/null
     then
      dojob $i
      break
     fi
   done
   sleep 10.9 >&/dev/null
 done
wait
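
The liveness test used in the polling loop can be tried on its own, as in the sketch below. `kill -0` sends no signal; it only reports whether the PID exists and may be signalled by the caller. The helper name `is_running` is hypothetical and is not part of the program above.

```shell
#!/usr/bin/env bash
# is_running is a hypothetical helper name used only for this sketch.
# It succeeds while the PID exists and the caller is allowed to signal it.
is_running() {
  kill -0 "$1" 2>/dev/null
}

sleep 0.3 &
pid=$!

if is_running "$pid"; then before=running; else before=gone; fi
wait "$pid"    # reap the child so its PID no longer exists
if is_running "$pid"; then after=running; else after=gone; fi

echo "before wait: $before, after wait: $after"
```

Note that `kill -0` also fails for a process owned by another user, so this check is only meaningful for the script's own children.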

怀里藏娇 2024-11-14 01:47:00

Do the worker jobs exit when they're finished? If so, rather than using suspend and SIGCONT, how about simply using wait $PIDS in the driver script?
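
Assuming the worker jobs do exit when they finish, a `wait`-based driver for the dependency graph in the question might look like the sketch below. `run_job` and the `LOG` file are hypothetical stand-ins for the real jobs, and the graph is hard-coded rather than read from the driver's hash tables.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a real job: sleeps briefly, then records its name.
run_job() {
  sleep 0.1
  echo "$1" >> "$LOG"
}

LOG=$(mktemp)

run_job A & pid_a=$!
wait "$pid_a"             # B and C both depend on A

run_job B & pid_b=$!
run_job C & pid_c=$!

wait "$pid_b"             # D depends only on B, so C may still be running
run_job D & pid_d=$!

wait "$pid_d" "$pid_c"    # E depends on both D and C
run_job E & pid_e=$!

wait "$pid_e"             # F depends on E
run_job F & pid_f=$!
wait "$pid_f"

cat "$LOG"                # completion order of the six jobs
```

Because `wait` blocks until the named children have exited, the driver needs neither `suspend` nor a SIGCONT from the jobs, and it behaves the same whether it is started in the foreground or the background.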
