Why can't you sleep while holding a spinlock?

Posted 2024-10-12 18:14:37

In the Linux kernel, why can't you sleep while holding a spinlock?

Comments (8)

静谧 2024-10-19 18:14:37

Example: your driver is executing and has just taken out a lock that controls access to its device. While the lock is held, the device issues an interrupt, which causes your interrupt handler to run. The interrupt handler, before accessing the device, must also obtain the lock. Taking out a spinlock in an interrupt handler is a legitimate thing to do; that is one of the reasons that spinlock operations do not sleep. But what happens if the interrupt routine executes in the same processor as the code that took out the lock originally? While the interrupt handler is spinning, the noninterrupt code will not be able to run to release the lock. That processor will spin forever.

Source: http://www.makelinux.net/ldd3/chp-5-sect-5.shtml
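
The same-CPU scenario in the quote can be modeled in userspace. This is a simulation under stated assumptions, not kernel code: `dev_lock`, `spin_lock_bounded`, `spin_unlock_model`, `irq_handler_model` and `driver_interrupted_while_locked` are hypothetical names, and the spin limit exists only so the example terminates (a real spinlock spins without bound).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device lock, modeled with a C11 atomic flag. */
static atomic_flag dev_lock = ATOMIC_FLAG_INIT;

/* Try to take the lock, giving up after max_spins attempts.
 * A real spinlock has no such limit; it exists so the demo ends. */
static bool spin_lock_bounded(atomic_flag *lock, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        /* test_and_set returns the previous value: false means we got it. */
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;
    }
    return false;
}

static void spin_unlock_model(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Simulated interrupt handler: it must take the device lock too. */
static bool irq_handler_model(void)
{
    if (!spin_lock_bounded(&dev_lock, 1000))
        return false;   /* in a real kernel this CPU would spin forever */
    spin_unlock_model(&dev_lock);
    return true;
}

/* Driver code that is "interrupted" on the same CPU while holding the lock. */
static bool driver_interrupted_while_locked(void)
{
    bool handler_ok;

    (void)spin_lock_bounded(&dev_lock, 1);  /* lock is free, so this succeeds */
    handler_ok = irq_handler_model();       /* interrupt arrives before unlock */
    spin_unlock_model(&dev_lock);
    return handler_ok;
}
```

In this model the handler can never succeed while the interrupted code holds the lock, because the holder only resumes after the handler returns. Once the lock has been released, the same handler takes it immediately.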

冷情 2024-10-19 18:14:37

It's not that you can't sleep while holding a spinlock; it's that doing so is a very, very bad idea. Quoting LDD:

Therefore, the core rule that applies to spinlocks is that any code must, while holding a spinlock, be atomic. It cannot sleep; in fact, it cannot relinquish the processor for any reason except to service interrupts (and sometimes not even then).

A deadlock like the one mentioned above may leave the system in an unrecoverable state. Another thing that could happen is that the spinlock gets locked on one CPU, and then when the thread sleeps, it wakes up on another CPU, resulting in a kernel panic.

Answering Bandicoot's comment: on a uniprocessor preemptible kernel, taking a spinlock does nothing more than disable preemption, because on a single CPU that alone is enough to prevent races.

If the kernel is compiled without CONFIG_SMP, but CONFIG_PREEMPT is set, then spinlocks simply disable preemption, which is sufficient to prevent any races. For most purposes, we can think of preemption as equivalent to SMP, and not worry about it separately.

http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/index.html
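
To make the uniprocessor case concrete, here is a minimal userspace sketch, assuming a single CPU. It is not the kernel's implementation: `model_preempt_count`, `model_spin_lock`, `model_spin_unlock` and `model_may_sleep` are hypothetical names that only mimic the idea of a preemption counter.

```c
#include <stdbool.h>

/* Model of a per-CPU preemption counter: > 0 means "preemption disabled". */
static int model_preempt_count;

/* With one CPU and a preemptible kernel, "locking" reduces to this. */
static void model_spin_lock(void)
{
    model_preempt_count++;
}

static void model_spin_unlock(void)
{
    model_preempt_count--;
}

/* Sleeping is only legal when no spinlock is held, i.e. the count is zero. */
static bool model_may_sleep(void)
{
    return model_preempt_count == 0;
}
```

A check of this kind is roughly what lets the real kernel report "scheduling while atomic" when code tries to schedule with preemption disabled.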

荒路情人 2024-10-19 18:14:37

I think this mail gives a clear answer:

A process cannot be preempted or sleep while holding a spinlock because of how spinlocks behave. If a process grabs a spinlock and goes to sleep before releasing it, a second process (or an interrupt handler) that tries to grab the spinlock will busy-wait. On a uniprocessor machine, the second process will monopolize the CPU, never allowing the first process to wake up and release the spinlock so that the second process can continue. It is basically a deadlock.

这样的小城市 2024-10-19 18:14:37

I disagree with William's response (his example). He's mixing two different concepts: preemption and synchronization.

An interrupt context can preempt a process context, so if a resource is shared by both, we need to use

spin_lock_irqsave()

to (1) disable interrupts on the local CPU and (2) acquire the lock. Step 1 prevents an interrupt handler from preempting the lock holder.

I think this thread is quite convincing. Sleeping means a thread/process yields the CPU and context-switches to another one without releasing the spinlock; that's why it's wrong.
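
The two steps can be illustrated with a small single-CPU model. This is a userspace sketch under stated assumptions, not the kernel API: `lock_irqsave_model`, `unlock_irqrestore_model`, `raise_irq_model` and `handler_model` are hypothetical stand-ins, and the interrupt flag is a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;  /* model of the local CPU interrupt flag */
static bool lock_held;            /* one CPU in the model, so a bool suffices */
static bool irq_pending;          /* interrupt latched while disabled */
static int  handler_runs;         /* how many times the handler has run */

/* The handler needs the lock; it must never run while the lock is held. */
static void handler_model(void)
{
    assert(!lock_held);
    lock_held = true;
    handler_runs++;
    lock_held = false;
}

/* Device raises an interrupt: deliver it now if enabled, else latch it. */
static void raise_irq_model(void)
{
    if (irqs_enabled)
        handler_model();
    else
        irq_pending = true;
}

static bool lock_irqsave_model(void)
{
    bool flags = irqs_enabled;  /* (1) save state, disable local interrupts */

    irqs_enabled = false;
    lock_held = true;           /* (2) acquire the lock */
    return flags;
}

static void unlock_irqrestore_model(bool flags)
{
    lock_held = false;
    irqs_enabled = flags;       /* restore the saved interrupt state */
    if (irqs_enabled && irq_pending) {
        irq_pending = false;
        handler_model();        /* the deferred interrupt is delivered now */
    }
}
```

An interrupt raised between the lock and unlock calls is deferred rather than delivered, so the handler never spins against a lock held by the code it interrupted.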

空城旧梦 2024-10-19 18:14:37

The key point is that in the Linux kernel, acquiring a spinlock disables preemption. Sleeping while holding a spinlock can therefore cause a deadlock.

For example, thread A acquires a spinlock. Thread A will not be preempted until it releases the lock. As long as thread A quickly does its job and releases the lock, there is no problem. But if thread A sleeps while holding the lock, thread B can be scheduled to run, since the sleep function invokes the scheduler. Thread B may then want the same lock: it disables preemption and tries to acquire it, and a deadlock occurs. Thread B will never get the lock because thread A holds it, and thread A will never get to run because thread B has disabled preemption and keeps spinning.

And why disable preemption in the first place? I guess it's because we don't want threads on other processors to wait too long.
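
The thread A / thread B sequence can be traced with a toy single-CPU model. This is a sketch under simplifying assumptions (one CPU, two tasks, no real scheduler); `run_single_cpu`, `enum outcome` and the `spin_budget` parameter are hypothetical and exist only so the simulated spin terminates.

```c
#include <stdbool.h>

enum outcome { COMPLETED, DEADLOCKED };

/* Simulate task A followed by task B on a single CPU.
 * a_sleeps_with_lock: A sleeps inside its critical section.
 * spin_budget: how long we let B spin before declaring deadlock. */
static enum outcome run_single_cpu(bool a_sleeps_with_lock, int spin_budget)
{
    bool lock_held;

    /* Task A runs first and takes the lock. */
    lock_held = true;
    if (!a_sleeps_with_lock) {
        /* A finishes its critical section before anyone else can run. */
        lock_held = false;
    }
    /* Otherwise A sleeps here and the scheduler switches to B
     * with the lock still held. */

    /* Task B now owns the only CPU and spins for the lock. */
    for (int i = 0; i < spin_budget; i++) {
        if (!lock_held)
            return COMPLETED;  /* B acquires the lock and proceeds */
        /* B never yields while spinning, so A cannot run to release. */
    }
    return DEADLOCKED;
}
```

Calling `run_single_cpu(false, 1000)` models the well-behaved case, while `run_single_cpu(true, 1000)` models A sleeping with the lock held: B exhausts its spin budget because nothing can ever clear the lock.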

枯叶蝶 2024-10-19 18:14:37

Another likely explanation is that pre-emption is disabled in a spinlock context.

紫瑟鸿黎 2024-10-19 18:14:37

Apart from what willtate has mentioned, assume that a process sleeps while holding a spinlock. If the newly scheduled process tries to acquire the same spinlock, it starts spinning until the lock becomes available. Since the new process keeps spinning, the first process cannot be scheduled, so the lock is never released. The second process spins forever and we have a deadlock.

鱼忆七猫命九 2024-10-19 18:14:37

Totally agree with Nan Wang.
I guess the most important concepts are "preemption" and "scheduling", and what happens to them when a spinlock is acquired.
When a spinlock is acquired, preemption is disabled (I don't know for sure whether that is true, but assume it is). This means the timer interrupt can't preempt the current spinlock holder, but the holder can still call a kernel function that sleeps, which actively invokes the scheduler and runs "another task".
If "another task" happens to want the same spinlock as the first holder, here is the problem: since preemption was already disabled by the first holder, "another task", which started running only because the first holder called the scheduler, can't be preempted. Its spinning takes the CPU forever, and that is why the deadlock happens.
