当前位置：文江博客话题详情

自旋锁的底层是如何实现的？

发布于 2024-09-02 19:10:47 字数 185 浏览 11 评论 0原文

这是一个可以被持有的锁只有一个执行线程时间。尝试获取锁由另一个执行线程使得后者循环直到锁定已发布。

当两个线程尝试完全相同的时间获取锁时，它如何处理这种情况？

我认为这个问题也适用于各种其他互斥体实现。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

一刻暧昧 2024-09-09 19:10:47

正如前一张海报所示，每种现代机器类型都有一类特殊的指令，称为“原子”，它们确实按照前一张海报所示进行操作……它们至少针对指定的内存位置序列化执行。

在 x86 上，有一个 LOCK 汇编器前缀，指示机器应以原子方式处理下一条指令。当遇到该指令时，x86 上实际上会发生一些事情。

挂起的读预取被取消（这意味着 CPU 不会向程序提供可能在原子范围内过时的数据）。
待处理的内存写入将被刷新。
该操作以原子方式执行、保证并针对其他 CPU 进行串行化。在这种情况下，“系列化”意味着“它们一次发生一个”。原子地意味着“该指令的所有部分都在没有任何其他干预的情况下发生”。

对于x86，有两个常用的指令用于实现锁。

CMPXCHG。有条件交换。伪代码：

uint32 cmpxchg(uint32 *memory_location, uint32 old_value, uint32 new_value) {
    atomically {
        if (*memory_location == old_value) 
            *memory_location = new_value;
        return old_value;
    }
}

XCHG。伪代码：

uint32 xchg(uint32 *memory_location, uint32 new_value) {
    atomically {
        uint32 old_value = *memory_location;
        *memory_location = new_value;
        return *old_value;
    }
}

所以，你可以像这样实现一个锁：

uint32 mylock = 0;
while (cmpxchg(&mylock, 0, 1) != 0)
    ;

我们自旋，等待锁，因此，自旋锁。

现在，解锁的指令不会表现出这些良好的行为。根据您使用的机器，使用未锁定的指令，可以观察到各种违反一致性的情况。例如，即使在具有非常友好的内存一致性模型的 x86 上，也可以观察到以下情况：

    Thread 1      Thread 2
    mov [w], 0    mov [x], 0
    mov [w], 1    mov [x], 2
    mov eax, w    mov eax, x
    mov [y], eax  mov [z], eax

在该程序结束时，y 和 z 都可以具有值 0！。

无论如何，最后一点：x86 上的 LOCK 可以应用于 ADD、OR 和 AND，以便为指令获得一致的原子读-修改-写语义。这对于设置标志变量并确保它们不会丢失很重要。如果没有这个，你就会遇到这个问题：

   Thread 1       Thread 2
   AND [x], 0x1   AND [x], 0x2

在这个程序结束时，x 的可能值为 1、2 和 0x1|0x2 (3)。为了获得正确的程序，您需要：

   Thread 1           Thread 2
   LOCK AND [x], 0x1  LOCK AND [x], 0x2

希望这会有所帮助。

As the previous poster indicates, every modern machine type has a special class of instruction known as 'atomics' that do operate as the previous poster indicates... they serialize execution against at least the specified memory location.

On x86, there is a LOCK assembler prefix that indicates to the machine that the next instruction should be handled atomically. When the instruction is encountered, several things effectively happen on x86.

Pending read prefetches are canceled (this means that the CPU won't present data to the program that may be made stale across the atomic).
Pending writes to memory are flushed.
The operation is performed, guaranteed atomically and serialized against other CPUs. In this context, 'serialized' means 'they happen one-at-a-time'. Atomically means "all the parts of this instruction happen without anything else intervening".

For x86, there are two commonly used instructions that are used to implement locks.

CMPXCHG. Conditional exchange. Pseudocode:

uint32 cmpxchg(uint32 *memory_location, uint32 old_value, uint32 new_value) {
    atomically {
        if (*memory_location == old_value) 
            *memory_location = new_value;
        return old_value;
    }
}

XCHG. Pseudocode:

uint32 xchg(uint32 *memory_location, uint32 new_value) {
    atomically {
        uint32 old_value = *memory_location;
        *memory_location = new_value;
        return *old_value;
    }
}

So, you can implement a lock like this:

uint32 mylock = 0;
while (cmpxchg(&mylock, 0, 1) != 0)
    ;

We spin, waiting for the lock, hence, spinlock.

Now, unlocked instructions don't exhibit these nice behaviors. Depending on what machine you're on, with unlocked instructions, all sorts of violations of consistency can be observed. For example, even on x86, which has a very friendly memory consistency model, the following could be observed:

    Thread 1      Thread 2
    mov [w], 0    mov [x], 0
    mov [w], 1    mov [x], 2
    mov eax, w    mov eax, x
    mov [y], eax  mov [z], eax

At the end of this program, y and z can both have the value 0!.

Anyway, one last note: LOCK on x86 can be applied to ADD, OR, and AND, in order to get consistent and atomic read-modify-write semantics for the instruction. This is important for, say, setting flag variables and making sure they don't get lost. Without that, you have this problem:

   Thread 1       Thread 2
   AND [x], 0x1   AND [x], 0x2

At the end of this program, possible values for x are 1, 2, and 0x1|0x2 (3). In order to get a correct program, you need:

   Thread 1           Thread 2
   LOCK AND [x], 0x1  LOCK AND [x], 0x2

Hope this helps.

回复收藏 0 原文

不回头走下去 2024-09-09 19:10:47

取决于处理器和线程实现。大多数处理器都具有可以原子执行的指令，您可以在其之上构建诸如自旋锁之类的东西。例如，IA-32 有一个执行原子交换的 xchg 指令。然后你可以实现一个简单的自旋锁，例如：

  eax = 1;
  while( xchg(eax, lock_address) != 0 );
  // now I have the lock
  ... code ...
  *lock_address = 0; // release the lock

Depends on the processor and the threading implementation. Most processors have instructions that can be executed atomically, on top of which you can build things like spin locks. For example IA-32 has an xchg instruction that does an atomic swap. You can then implement a naive spinlock like:

  eax = 1;
  while( xchg(eax, lock_address) != 0 );
  // now I have the lock
  ... code ...
  *lock_address = 0; // release the lock

回复收藏 0 原文

~没有更多了~

关于作者

梦里兽

暂无简介

文章

342 人气

关注发私信

友情链接

文江博客

自旋锁的底层是如何实现的？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（2）

关于作者

相关话题

热门标签

推荐作者

浪子阿飞

JK.Yang

人间不值得

静待花开

只涨不跌

污浊的双黑

友情链接

自旋锁的底层是如何实现的？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（2）

关于作者

相关话题

热门标签

推荐作者

浪子阿飞

JK.Yang

人间不值得

静待花开

只涨不跌

污浊的双黑

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。