How to synchronize (lock/unlock) access to a file in bash from multiple scripts?
I'm writing scripts that will run in parallel and will get their input data from the same file. These scripts will open the input file, read the first line, store it for further treatment and finally erase this read line from the input file.
Now the problem is that multiple scripts accessing the file can lead to the situation where two scripts access the input file simultaneously and read the same line, which produces the unacceptable result of the line being processed twice.
Now one solution is to write a lock file (.lock_input) before accessing the input file and erase it when releasing the input file, but this solution is not appealing in my case because NFS sometimes slows down network communication randomly and may not provide reliable locking.
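For illustration, a minimal sketch of that lock-file idea (using mkdir rather than a plain file, since directory creation is atomic; apart from the .lock_input name, everything here is an assumption):

#!/bin/bash
# Sketch only: .lock_input guards the critical section.
until mkdir .lock_input 2>/dev/null; do   # mkdir fails if the lock already exists
  sleep 1                                 # another script holds the lock; retry
done
FirstLine=$(head -n 1 input_file)         # critical section: consume one line
tail -n +2 input_file > tmp && mv tmp input_file
rmdir .lock_input                         # release the lock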
Another solution is to use a process lock instead of writing a file: the first script to access the input file launches a process called lock_input, and the other scripts run ps -elf | grep lock_input; if it shows up in the process list, they wait. This may be faster than writing to NFS, but it is still not a perfect solution...
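A rough sketch of that process-lock idea (the lock_input process is faked here with a renamed sleep via bash's exec -a; note there is still a window between the ps check and starting our own lock_input, which is part of why it is not perfect):

#!/bin/bash
# Sketch only: wait while any process named lock_input is running.
while ps -elf | grep '[l]ock_input' > /dev/null; do
  sleep 1                                   # someone else holds the "lock"
done
( exec -a lock_input sleep 1000 ) &         # become the lock holder ourselves
LOCK_PID=$!
# ... critical section: read and remove the first line of input_file ...
kill "$LOCK_PID"                            # drop the "lock"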
So my question is: Is there any bash command (or other script interpreter) or a service I can use that will behave like semaphore or mutex locks used for synchronization in thread programming?
Thank you.
Small rough example:
Let's say we have input_file as follows:

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday

Treatment script: TrScript.sh
#!/bin/bash
NbLines=$(cat input_file | wc -l)
while [ ! $NbLines = 0 ]
do
  FirstLine=$(head -1 input_file)            # grab the first line
  echo "Hello World today is $FirstLine"
  RemainingLines=$(expr $NbLines - 1)
  tail -n $RemainingLines input_file > tmp   # drop the consumed line
  mv tmp input_file
  NbLines=$(cat input_file | wc -l)
done
Main script:
#! /bin/bash
./TrScript.sh &
./TrScript.sh &
./TrScript.sh &
wait
The result should be:
Hello World today is Monday
Hello World today is Tuesday
Hello World today is Wednesday
Hello World today is Thursday
Hello World today is Friday
Hello World today is Saturday
Hello World today is Sunday
Comments (3)
Use flock(1) for accessing the file you want to read from. This uses file locks, though. The sketch below prints the first line of the input.
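The answer's original snippets were lost in extraction; what follows is a hedged reconstruction of the usual flock(1) idiom (the lock file name input_file.lock and the use of GNU sed are assumptions): an exclusive lock is taken on a dedicated lock file, the first line of input_file is printed, and that line is removed.

#!/bin/bash
# Sketch only: take an exclusive flock, print and remove the first line.
(
  flock -x 200                        # block until we hold the lock on fd 200
  FirstLine=$(head -n 1 input_file)
  echo "$FirstLine"                   # prints the first line of the input
  sed -i '1d' input_file              # delete that line (GNU sed assumed)
) 200> input_file.lock                # fd 200 refers to the lock file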
I have always liked the lockfile program (see the lockfile manpage) from the procmail set of tools (it should be available on most systems, though it might not be installed by default).
It was designed to lock mail spool files, which are (were?) commonly mounted via NFS, so it does work properly over NFS (as much as anything can).
Also, as long as you are making the assumption that all your 'workers' are on the same machine (by assuming you can check for PIDs, which may not work properly when PIDs eventually wrap), you could put your lock file in some other, local directory (e.g. /tmp) while processing files hosted on an NFS server. As long as all the workers use the same lock file location (and a one-to-one mapping of lockfile filenames to locked pathnames), it will work fine.
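A hedged sketch of how the critical section could be wrapped with procmail's lockfile (the lock path /tmp/input_file.lock is an assumption; by default lockfile keeps retrying until it can create the lock file):

#!/bin/bash
# Sketch only: serialize access with procmail's lockfile utility.
LOCK=/tmp/input_file.lock            # local lock location, as suggested above

lockfile "$LOCK"                     # blocks, retrying, until the lock file is created
FirstLine=$(head -n 1 input_file)    # critical section: consume one line
echo "Hello World today is $FirstLine"
tail -n +2 input_file > tmp && mv tmp input_file
rm -f "$LOCK"                        # release the lock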
Using the FLOM (Free LOck Manager) tool, your main script can become as easy as the sketch shown below if you are running the scripts inside a single host, and only slightly different if you want to distribute your scripts over many hosts. Some usage examples are available at this URL: http://sourceforge.net/p/flom/wiki/FLOM%20by%20examples/
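The answer's command lines did not survive extraction; the following is a hedged reconstruction based on flom's "flom -- command" invocation (the -A option and the multicast address shown for the distributed case are assumptions):

#!/bin/bash
# Single-host sketch: flom serializes the three workers on a local lock.
flom -- ./TrScript.sh &
flom -- ./TrScript.sh &
flom -- ./TrScript.sh &
wait

# Distributed sketch (assumption): have every host reach the same lock
# manager, e.g. via a multicast address:
#   flom -A 224.0.0.1 -- ./TrScript.sh &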