Multithreaded BASH programming: a generalized method?

Posted on 2024-08-10 20:02:33

Ok, I was running POV-Ray on all the demos, but POV's still single-threaded and wouldn't utilize more than one core. So, I started thinking about a solution in BASH.

I wrote a general function that takes a list of commands and runs them in the designated number of sub-shells. This actually works but I don't like the way it handles accessing the next command in a thread-safe multi-process way:

  • It takes, as an argument, a file with commands (1 per line),
  • To get the "next" command, each process ("thread") will:
    • Wait until it can create a lock file, with: ln $CMDFILE $LOCKFILE
    • Read the command from the file,
    • Modify $CMDFILE by removing the first line,
    • Remove the $LOCKFILE.

Is there a cleaner way to do this? I couldn't get the sub-shells to read a single line from a FIFO correctly.


Incidentally, the point of this is to enhance what I can do on a BASH command line, and not to find non-bash solutions. I tend to perform a lot of complicated tasks from the command line and want another tool in the toolbox.

Meanwhile, here's the function that handles getting the next line from the file. As you can see, it modifies an on-disk file each time it reads/removes a line. That's what seems hackish, but I'm not coming up with anything better, since FIFOs didn't work without setvbuf() in bash.

#
# Get/remove the first line from FILE, using LOCK as a semaphore (with
# short sleep for collisions).  Returns the text on standard output,
# returns zero on success, non-zero when file is empty.
#
parallel__nextLine() 
{
  local line rest file=$1 lock=$2

  # Wait for lock...
  until ln "${file}" "${lock}" 2>/dev/null
  do sleep 1
     [ -s "${file}" ] || return $?
  done

  # Open, read one "line", save "rest" back to the file:
  exec 3<"$file"
  read line <&3 ; rest=$(cat<&3)
  exec 3<&-

  # After last line, make sure file is empty:
  ( [ -z "$rest" ] || echo "$rest" ) > "${file}"

  # Remove lock and 'return' the line read:
  rm -f "${lock}"
  [ -n "$line" ] && echo "$line"
}
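
For reference, a minimal sketch of the kind of driver described above: it starts a fixed number of worker sub-shells that each pull commands with parallel__nextLine until the file runs dry. The parallel__run name, its parameters, and the use of eval are illustrative assumptions, not the original code:

#
# Sketch only: start N worker sub-shells that each consume commands from
# CMDFILE via parallel__nextLine.  parallel__run and its parameters are
# hypothetical, not part of the original post.
#
parallel__run()
{
  local cmdfile=$1 nworkers=$2 i cmd
  local lockfile="${cmdfile}.lock"

  for ((i = 0; i < nworkers; i++)); do
    (
      # Each worker pulls one command at a time until the file is empty.
      while cmd=$(parallel__nextLine "$cmdfile" "$lockfile"); do
        eval "$cmd"
      done
    ) &
  done
  wait    # block until every worker sub-shell has exited
}

# Hypothetical usage: run the commands in commands.txt, 4 at a time:
#   parallel__run commands.txt 4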

Comments (4)

怕倦 2024-08-17 20:02:33
#adjust these as required
args_per_proc=1 #1 is fine for long running tasks
procs_in_parallel=4

xargs -n$args_per_proc -P$procs_in_parallel povray < list

Note: the nproc command coming soon to coreutils will automatically determine
the number of available processing units, which can then be passed to -P.
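
For instance, a hedged sketch of that combination (assumes a coreutils build that already ships nproc; the fallback value of 4 is arbitrary):

# Use nproc when available, otherwise fall back to 4 parallel processes.
procs_in_parallel=$(nproc 2>/dev/null || echo 4)

xargs -n"$args_per_proc" -P"$procs_in_parallel" povray < list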

七度光 2024-08-17 20:02:33

If you need real thread safety, I would recommend migrating to a better scripting system.

With Python, for example, you can create real threads with safe synchronization using semaphores/queues.

独守阴晴ぅ圆缺 2024-08-17 20:02:33

Sorry to bump this after so long, but I pieced together a fairly good solution for this, IMO.
It doesn't work perfectly, but it will limit the script to a certain number of child tasks running, and then wait for all the rest at the end.

#!/bin/bash

pids=()

# Keep at most 6 background jobs: wait on the oldest PIDs until only 6
# remain, then remember those 6.
thread() {
  local this
  while [ ${#} -gt 6 ]; do
    this=${1}
    wait "$this"
    shift
  done
  pids=($1 $2 $3 $4 $5 $6)
}

for i in 1 2 3 4 5 6 7 8 9 10
do
  sleep 5 &                      # stand-in for the real long-running task
  pids=( ${pids[@]-} $! )        # append the new job's PID
  thread ${pids[@]}
done

# Wait for whatever is still running at the end.
for pid in ${pids[@]}
do
  wait "$pid"
done

It seems to work great for what I'm doing (handling parallel uploading of a bunch of files at once) and keeps it from breaking my server, while still making sure all the files get uploaded before the script finishes.
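
As a side note, a rough sketch of the same throttling using wait -n, which bash 4.3 and later provide; the upload command and the file list below are placeholders, not part of this answer:

#!/bin/bash
# Sketch only: keep at most max_jobs uploads running (needs bash >= 4.3 for wait -n).
max_jobs=6

for f in file1 file2 file3            # placeholder file list
do
  while (( $(jobs -rp | wc -l) >= max_jobs )); do
    wait -n                           # wait for any one background job to finish
  done
  upload "$f" &                       # placeholder for the real upload command
done
wait                                  # wait for whatever is still running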

三岁铭 2024-08-17 20:02:33

I believe you're actually forking processes here, not threading. I would recommend looking for threading support in a different scripting language like Perl, Python, or Ruby.
