Why can't you sleep while holding a spinlock?
In the Linux kernel, why can't you sleep while holding a spinlock?
Example: your driver is executing and has just taken out a lock that controls access to its device. While the lock is held, the device issues an interrupt, which causes your interrupt handler to run. The interrupt handler, before accessing the device, must also obtain the lock. Taking out a spinlock in an interrupt handler is a legitimate thing to do; that is one of the reasons that spinlock operations do not sleep. But what happens if the interrupt routine executes in the same processor as the code that took out the lock originally? While the interrupt handler is spinning, the noninterrupt code will not be able to run to release the lock. That processor will spin forever.
Source: http://www.makelinux.net/ldd3/chp-5-sect-5.shtml
It's not that you can't sleep while holding a spin lock. It is a very very bad idea to do that. Quoting LDD:
Any deadlock like mentioned above may result in an unrecoverable state. Another thing that could happen is that the spinlock gets locked on one CPU, and then when the thread sleeps, it wakes up on the other CPU, resulting in a kernel panic.
Answering Bandicoot's comment: on a uniprocessor kernel built with pre-emption, taking a spinlock does nothing more than disable pre-emption, because on a single CPU that is sufficient to prevent races.
http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/index.html
I think this mail has a clear answer:
I disagree with William's response (his example). He's mixing two different concepts: preemption and synchronization.
An interrupt context can preempt a process context, so if a resource is shared by both, we need to use
to (1) disable the IRQ and (2) acquire the lock. Step 1 rules out preemption by the interrupt.
I think this thread is quite convincing. Sleep() means a thread/process yields control of the CPU and context-switches to another one without releasing the spinlock; that's why it's wrong.
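The two steps above can be sketched with a small userspace model. This is not kernel code: `struct irq_model` and the `irq_model_*` / `run_critical_section` names are hypothetical, and the "deadlock" is only recorded as a flag instead of actually spinning. In the kernel, `spin_lock_irqsave()` and `spin_unlock_irqrestore()` combine the two steps; the model does not call them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: one CPU, one lock, one device interrupt. */
struct irq_model {
    bool irqs_enabled;  /* can the CPU take an interrupt right now? */
    bool irq_pending;   /* an interrupt was raised but has not run yet */
    bool lock_held;
    int  handler_runs;  /* times the handler completed its work */
    bool deadlocked;    /* handler found the lock held by its own CPU */
};

/* Interrupt handler: needs the same lock as the process-context code. */
static void irq_model_handler(struct irq_model *cpu)
{
    if (cpu->lock_held) {
        /* A real handler would spin here forever: the holder is the
         * very code it interrupted, which cannot run until it returns. */
        cpu->deadlocked = true;
        return;
    }
    cpu->lock_held = true;
    cpu->handler_runs++;
    cpu->lock_held = false;
}

/* The device raises an interrupt; it is delivered at once only if IRQs are on. */
static void irq_model_raise(struct irq_model *cpu)
{
    if (cpu->irqs_enabled)
        irq_model_handler(cpu);
    else
        cpu->irq_pending = true;
}

/* Process-context critical section; disable_irqs selects the locking style. */
static struct irq_model run_critical_section(bool disable_irqs)
{
    struct irq_model cpu = { .irqs_enabled = true };
    bool saved = cpu.irqs_enabled;

    if (disable_irqs)
        cpu.irqs_enabled = false;  /* step 1: disable the IRQ */
    cpu.lock_held = true;          /* step 2: acquire the lock */

    irq_model_raise(&cpu);         /* device interrupts mid-section */

    cpu.lock_held = false;         /* release the lock */
    cpu.irqs_enabled = saved;      /* restore the saved IRQ state */
    if (cpu.irqs_enabled && cpu.irq_pending) {
        assert(!cpu.lock_held);    /* deferred handler sees a free lock */
        cpu.irq_pending = false;
        irq_model_handler(&cpu);
    }
    return cpu;
}
```

Calling `run_critical_section(false)` leaves `deadlocked` set, while `run_critical_section(true)` defers the interrupt until after the unlock and the handler completes once.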
The key point is that in the Linux kernel, acquiring a spin lock disables preemption. Thus sleeping while holding a spin lock could potentially cause deadlock.
For example, thread A acquires a spin lock. Thread A will not be preempted until it releases the lock. As long as thread A quickly does its job and releases the lock, there is no problem. But if thread A sleeps while holding the lock, thread B could be scheduled to run, since the sleep function invokes the scheduler. Thread B could then try to take the same lock: it disables preemption too and starts spinning. Now a deadlock occurs. Thread B will never get the lock since thread A holds it, and thread A will never get to run since thread B has disabled preemption.
And why disable preemption in the first place? I guess it's because we don't want threads on other processors to wait too long.
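The thread A / thread B scenario can be walked through with a small single-CPU model. It is only a sketch under stated assumptions: `struct sched_model`, `sched_model_*` and `run_scenario` are hypothetical names, nothing here is a kernel API, and the spin loop is bounded so that "spins forever" shows up as a `false` return instead of a hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: one CPU, two tasks, one spinlock. */
enum task_id { TASK_NONE = -1, TASK_A = 0, TASK_B = 1 };

struct sched_model {
    enum task_id running;     /* task currently on the CPU */
    enum task_id lock_owner;  /* TASK_NONE when the lock is free */
    int preempt_count;        /* > 0 means preemption is disabled */
};

/* Spin for the lock. Preemption is disabled before spinning, so nothing
 * can put the owner back on this CPU while the loop runs. Returns false
 * when the bounded spin gives up; a real spin loop never would. */
static bool sched_model_lock(struct sched_model *cpu, int max_spins)
{
    cpu->preempt_count++;
    for (int i = 0; i < max_spins; i++) {
        if (cpu->lock_owner == TASK_NONE) {
            cpu->lock_owner = cpu->running;
            return true;
        }
    }
    return false;
}

static void sched_model_unlock(struct sched_model *cpu)
{
    assert(cpu->lock_owner == cpu->running);  /* only the owner unlocks */
    cpu->lock_owner = TASK_NONE;
    cpu->preempt_count--;
}

/* A voluntary sleep calls the scheduler directly, so the task switch
 * happens even though preemption is disabled. */
static void sched_model_sleep(struct sched_model *cpu, enum task_id next)
{
    cpu->running = next;
}

/* Returns true if thread B eventually gets the lock. */
static bool run_scenario(bool a_sleeps_with_lock)
{
    struct sched_model cpu = { TASK_A, TASK_NONE, 0 };
    bool ok = sched_model_lock(&cpu, 1000);   /* A takes the free lock */
    assert(ok);

    if (!a_sleeps_with_lock)
        sched_model_unlock(&cpu);   /* A releases before giving up the CPU */
    sched_model_sleep(&cpu, TASK_B);          /* A sleeps, B is scheduled */

    return sched_model_lock(&cpu, 1000);      /* B wants the same lock */
}
```

With `run_scenario(false)` thread B finds the lock free. With `run_scenario(true)` B exhausts its spin budget, which stands in for the deadlock described above.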
Another likely explanation is that, in a spinlock context, pre-emption is disabled.
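One way to picture "pre-emption is disabled" is as a counter that the lock raises and the unlock lowers, with sleeping code checking it first. The sketch below is a userspace model with hypothetical names (`struct task_model`, `task_model_*`); it only loosely mirrors the idea behind the kernel's `might_sleep()` check, which with `CONFIG_DEBUG_ATOMIC_SLEEP` reports "BUG: sleeping function called from invalid context".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: a per-task preemption counter. */
struct task_model {
    int preempt_count;      /* > 0 means "atomic": preemption is disabled */
    int atomic_sleep_bugs;  /* sleeps attempted while atomic */
};

/* Taking a spinlock disables preemption by raising the counter. */
static void task_model_spin_lock(struct task_model *t)
{
    t->preempt_count++;
}

/* Releasing it lowers the counter again. */
static void task_model_spin_unlock(struct task_model *t)
{
    assert(t->preempt_count > 0);  /* unlock must pair with a lock */
    t->preempt_count--;
}

/* Called at the top of anything that may sleep. Returns false and
 * records a bug when the caller is in atomic context. */
static bool task_model_might_sleep(struct task_model *t)
{
    if (t->preempt_count > 0) {
        t->atomic_sleep_bugs++;
        return false;
    }
    return true;
}
```

In this model a sleep attempted between `task_model_spin_lock()` and `task_model_spin_unlock()` is refused and counted, and the same call succeeds once the lock has been dropped.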
Apart from what willtate has mentioned, assume that a process sleeps while holding a spinlock. If the new process that is scheduled tries to acquire the same spinlock, it starts spinning, waiting for the lock to become available. Since the new process keeps spinning, the first process cannot be scheduled, so the lock is never released, the second process spins forever, and we have a deadlock.
Totally agree with Nan Wang.
I guess the most important concepts are "preemption" and "scheduling", and what happens to them when a spinlock is acquired.
When a spinlock is acquired, preemption is disabled (true or not, I don't know, but assume it is correct). That means the timer interrupt can't preempt the current spinlock holder, but the holder can still call sleepable kernel functions, actively invoke the scheduler, and run "another task".
If "another task" happens to want the same spinlock as the first holder, here is the problem: since preemption is already disabled by the first spinlock holder, "another task", which only runs because the first holder actively called the scheduler, can't be preempted, so its spinning takes the CPU forever. This is why deadlock happens.