How do I synchronize (lock/unlock) access to a file in bash from multiple scripts?

Posted on 2024-08-22 13:55:19

I'm writing scripts that will run in parallel and will get their input data from the same file. These scripts will open the input file, read the first line, store it for further treatment and finally erase this read line from the input file.

Now the problem is that multiple scripts accessing the file can lead to the situation where two scripts access the input file simultaneously and read the same line, which produces the unacceptable result of the line being processed twice.

One solution is to write a lock file (.lock_input) before accessing the input file and to erase it when releasing the input file. This is not appealing in my case, because NFS sometimes slows network communication at random and may not provide reliable locking.

Another solution is to use a process lock instead of writing a file: the first script to access the input file launches a process called lock_input, and the other scripts run ps -elf | grep lock_input. If the process is in the process list, they wait. This may be faster than writing to NFS, but it is still not a perfect solution.

So my question is: Is there any bash command (or other script interpreter) or a service I can use that will behave like semaphore or mutex locks used for synchronization in thread programming?

Thank you.

Small rough example:

Let's say we have input_file as follows:

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday 
Sunday

Treatment script: TrScript.sh

#!/bin/bash  
NbLines=$(cat input_file | wc -l)  
while [ ! $NbLines = 0 ]  
do  
  FirstLine=$(head -1 input_file)  
  echo "Hello World today is $FirstLine"  
  RemainingLines=$(expr $NbLines - 1 )  
  tail -n $RemainingLines input_file > tmp  
  mv tmp input_file  
  NbLines=$(cat input_file | wc -l)   
done

Main script:

#! /bin/bash  
./TrScript.sh &  
./TrScript.sh &  
./TrScript.sh &  
wait

The result should be:

Hello World today is Monday  
Hello World today is Tuesday  
Hello World today is Wednesday  
Hello World today is Thursday  
Hello World today is Friday  
Hello World today is Saturday  
Hello World today is Sunday


Comments (3)

小鸟爱天空丶 2024-08-29 13:55:19

Use

line=`flock $lockfile -c "(gawk 'NR==1' < $infile ; gawk 'NR>1' < $infile > $infile.tmp ; mv $infile.tmp $infile)"`

to access the file you want to read from. Note that this still relies on file locks.

gawk NR==1 < ...

prints the first line of the input; gawk 'NR>1' prints every line after it, which becomes the new content of the file.
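As an illustration of the same idea, here is a minimal sketch of the asker's TrScript.sh loop with the read-and-remove step wrapped in an exclusive flock lock. It assumes the flock utility from util-linux is installed. The names workdir, pop_line and worker, and the file-descriptor number 9, are hypothetical choices for this demo and are not part of the original answer. Whether flock is reliable over NFS depends on the client and server setup, so treat this as a sketch for a local file system.

```shell
#!/bin/bash
# Sketch: pop the first line of a shared file under an exclusive flock(1) lock.
# Assumes util-linux flock; workdir, pop_line and worker are hypothetical names.

workdir=$(mktemp -d)
infile="$workdir/input_file"
lockfile="$workdir/input_file.lock"
printf '%s\n' Monday Tuesday Wednesday Thursday Friday Saturday Sunday > "$infile"

# Print the first line of $infile and remove it from the file.
# Returns non-zero when the file is empty.
pop_line() {
  (
    flock -x 9 || exit 1            # block until this process holds the lock
    [ -s "$infile" ] || exit 1      # nothing left to process
    head -n 1 "$infile"
    tail -n +2 "$infile" > "$infile.tmp" && mv "$infile.tmp" "$infile"
  ) 9> "$lockfile"                  # the lock is released when the subshell exits
}

# Same loop as TrScript.sh, but each line is taken under the lock.
worker() {
  local line
  while line=$(pop_line); do
    echo "Hello World today is $line"
  done
}

# Three parallel workers, as in the main script of the question.
worker > "$workdir/out.1" &
worker > "$workdir/out.2" &
worker > "$workdir/out.3" &
wait
cat "$workdir"/out.* > "$workdir/result"
```

Each day should appear exactly once in the combined result, although which worker handles which line, and therefore the order of the combined output, is not fixed. The lock is taken on a separate lock file rather than on input_file itself because mv replaces input_file with a new file on every iteration.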

つ可否回来 2024-08-29 13:55:19

I have always liked the lockfile program (see the lockfile man page) from the procmail set of tools. It should be available on most systems, though it might not be installed by default.

It was designed to lock mail spool files, which are (were?) commonly mounted via NFS, so it does work properly over NFS (as much as anything can).

Also, as long as you are assuming that all your 'workers' are on the same machine (which you do if you assume you can check for PIDs, a check that may not work properly once PIDs eventually wrap), you could put your lock file in some other, local directory (e.g. /tmp) while processing files hosted on an NFS server. As long as all the workers use the same lock file location (and there is a one-to-one mapping of lock file names to locked pathnames), it will work fine.
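A minimal sketch of this approach follows, assuming procmail's lockfile is installed. The lock lives in a local directory while the data file could sit on NFS. The names workdir, pop_line and worker, the lock path, and the retry settings are hypothetical choices for this demo. Real workers started as separate scripts would need to agree on one fixed lock path instead of the PID-based one used here.

```shell
#!/bin/bash
# Sketch: guard the read-and-remove step with procmail's lockfile(1).
# workdir, pop_line and worker are hypothetical names used only for this demo.

command -v lockfile >/dev/null || { echo "lockfile is not installed" >&2; exit 0; }

workdir=$(mktemp -d)                     # stands in for the shared (possibly NFS) directory
infile="$workdir/input_file"
lock="/tmp/input_file.demo.$$.lock"      # lock kept in a local directory
printf '%s\n' Monday Tuesday Wednesday Thursday Friday Saturday Sunday > "$infile"

# Print the first line of $infile and remove it from the file.
# Returns 1 when the file is empty, 2 when the lock could not be acquired.
pop_line() {
  local status=1
  # -1: sleep 1 second between attempts; -r 60: give up after 60 retries
  lockfile -1 -r 60 "$lock" || return 2
  if [ -s "$infile" ]; then
    head -n 1 "$infile"
    tail -n +2 "$infile" > "$infile.tmp" && mv "$infile.tmp" "$infile"
    status=$?
  fi
  rm -f "$lock"                          # release the lock
  return "$status"
}

worker() {
  local line
  while line=$(pop_line); do
    echo "Hello World today is $line"
  done
}

worker > "$workdir/out.1" &
worker > "$workdir/out.2" &
worker > "$workdir/out.3" &
wait
cat "$workdir"/out.* > "$workdir/result"
```

Because a waiting worker sleeps between attempts, this version can take a few seconds under contention. A worker that dies while holding the lock leaves the lock file behind, which lockfile's -l option (a lock timeout in seconds) is meant to address.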

偏爱自由 2024-08-29 13:55:19

Using the FLOM (Free LOck Manager) tool, your main script can become as easy as:

#!/bin/bash  
flom -- ./TrScript.sh &  
flom -- ./TrScript.sh &  
flom -- ./TrScript.sh &  
wait

if you are running the scripts on a single host, or something like:

flom -A 224.0.0.1 -- ./TrScript.sh &

if you want to distribute your script on many hosts. Some usage examples are available at this URL: http://sourceforge.net/p/flom/wiki/FLOM%20by%20examples/
