Why doesn't bash's flock exit when it times out trying to acquire the lock?
I am experimenting with flock, a command-line utility for file locking, to prevent two instances of the same code from running at the same time.
I am using this test code:
( ( flock -x 200 ; sleep 10 ; echo "original finished" ; ) 200>./test.lock ) &
( sleep 2 ; ( flock -x -w 2 200 ; echo "a finished" ) 200>./test.lock ) &
I am running two backgrounded subshells. The (flock NUM; ...) NUM>FILE syntax is from flock's man page.
I expect the first subshell to get an exclusive lock on test.lock, wait 10 seconds, then print "original finished", holding the lock the whole time. The second subshell will start at more or less the same time, wait 2 seconds, then try to get a lock on test.lock, but time out after 2 seconds. If it gets the lock, it will print "a finished". If it doesn't get the lock, that subshell should stop, and nothing should be printed.
Since the first subshell holds the lock for 10 seconds, the second subshell should not get the lock and shouldn't finish. That is, only "original finished" should be printed, not both messages.
What actually happens is that "a finished" is printed, then "original finished" is printed.
This implies that the second subshell is either (a) not using the same lock as the first subshell, (b) failing to get the lock but continuing to execute anyway, or (c) something else.
Why don't those locks work as I expect?
Comments (1)
The issue is that if the flock process fails to get the lock within the timeout, it has no way of killing its parent process (i.e. the shell that spawned it). All it can do is return a failure exit code. You need to check that exit code before continuing, so that the commands after flock run only when the lock was actually acquired.
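As a rough sketch of that check (not necessarily the exact code from the original answer), the test from the question could be wrapped as below. The run_demo function and its optional arguments are hypothetical names added for illustration; the defaults mirror the question's timings. It relies on flock returning a non-zero status when the -w timeout expires.

```shell
#!/usr/bin/env bash
# run_demo is a hypothetical helper, not part of the original post.
# Optional arguments: lock file, hold time, start delay, flock timeout.
# The defaults match the timings used in the question.
run_demo() {
  local lock="${1:-./test.lock}"
  local hold="${2:-10}" delay="${3:-2}" timeout="${4:-2}"

  # First subshell: take the exclusive lock, hold it, then report.
  ( flock -x 200 ; sleep "$hold" ; echo "original finished" ) 200>"$lock" &

  # Second subshell: flock exits non-zero when the timeout expires,
  # so "|| exit 1" leaves the subshell before the echo is reached.
  ( sleep "$delay"
    ( flock -x -w "$timeout" 200 || exit 1
      echo "a finished" ) 200>"$lock" ) &

  # Wait for both background subshells to end.
  wait
}
```

Calling run_demo with no arguments should print only "original finished" after about 10 seconds. Writing the second subshell's body as flock -x -w 2 200 && echo "a finished" is an equivalent way to make the echo conditional on the lock.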