How the 8086 LOCK pin and the ASM LOCK prefix work
I am a programmer learning assembly language so that I can intuitively understand how my code runs on the CPU.
While I was studying the ASM keyword LOCK, Google told me that the CPU takes exclusive ownership of the data bus while it executes the instruction that follows a LOCK prefix.
But I could not find any further information on how the CPU takes that exclusive ownership.
I also found that the 8086 chip has a LOCK pin, which does exactly the same thing as the LOCK keyword. This may be the logic circuit that implements the LOCK keyword.
Can anyone explain the mechanism of the LOCK pin? While the LOCK pin is active, how are other CPUs refused when they try to acquire the data bus?
Comments (1)
If the only CPU in the system has the memory bus locked, no other device can read or change memory contents during that time, not even via DMA. (The same holds for multiple CPUs on a shared bus with no caches.) Therefore no other memory operation at all can happen between the load and the store of, for example, lock add [di], ax, which makes it atomic with respect to any possible observer. (Other than a logic analyzer connected to the bus, which doesn't count.)

Semi-related: "Can num++ be atomic for 'int num'?" describes how the lock prefix works on modern CPUs for cacheable memory: it provides RMW (read-modify-write) atomicity without a bus lock, just by hanging on to the cache line for the duration of the operation. We call this a "cache lock". All modern CPUs work this way for aligned locked operations, and only do an expensive bus lock for something like xchg [mem], ax that spans a boundary between two cache lines. That hurts throughput on all cores, and it is so expensive that modern CPUs have a way to make such an access always fault (without faulting on other unaligned loads/stores), as well as performance counters for it.

Fun fact: xchg [mem], reg has implicit lock semantics on 386 and newer. (This is unfortunate, because for performance reasons it makes xchg unusable as a plain load/store when you're running low on registers.) On 286 and earlier it did not, unless you wrote lock xchg. This is possibly related to the fact that there were SMP 386 systems (with a primitive sequentially consistent memory model). The modern x86 memory model applies to 486 and later SMP systems.
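
To make the atomicity described above observable from a high-level language, here is a minimal sketch in C11. It is not taken from the answer: the names run_lock_demo, worker, spin_lock and spin_unlock, and the thread and iteration counts, are all hypothetical. It assumes a POSIX system with pthreads. On x86, compilers typically turn atomic_fetch_add into a lock-prefixed add or xadd, and atomic_exchange into xchg with a memory operand (implicitly locked), but the exact instructions depend on the compiler and target.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define ITERATIONS  50000

/* Counter updated with an atomic read-modify-write.
   Assumption: on x86 this usually compiles to lock add / lock xadd. */
static atomic_int atomic_counter;

/* Plain counter guarded by a test-and-set spinlock built on atomic
   exchange. Assumption: on x86 the exchange usually compiles to
   xchg [mem], reg, which is implicitly locked. */
static atomic_int spin_flag;   /* 0 = free, 1 = held */
static int guarded_counter;

static void spin_lock(void) {
    /* Keep swapping in 1 until the previous value was 0 (lock was free). */
    while (atomic_exchange_explicit(&spin_flag, 1, memory_order_acquire) != 0) {
        /* busy-wait */
    }
}

static void spin_unlock(void) {
    atomic_store_explicit(&spin_flag, 0, memory_order_release);
}

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        atomic_fetch_add(&atomic_counter, 1);  /* atomic RMW, no lock needed */
        spin_lock();
        guarded_counter++;                     /* plain RMW inside the lock */
        spin_unlock();
    }
    return NULL;
}

/* Hypothetical helper: runs the workers and returns 1 if both counters
   ended at exactly NUM_THREADS * ITERATIONS, 0 otherwise. */
int run_lock_demo(void) {
    pthread_t threads[NUM_THREADS];

    atomic_store(&atomic_counter, 0);
    atomic_store(&spin_flag, 0);
    guarded_counter = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }

    int expected = NUM_THREADS * ITERATIONS;
    return atomic_load(&atomic_counter) == expected
        && guarded_counter == expected;
}
```

Calling run_lock_demo() from a main function (older toolchains may need -pthread when compiling) should return 1, because no increment is lost on either counter. Replacing atomic_fetch_add with a plain, unlocked increment on an ordinary int would be a data race and could lose updates, which is the situation the lock prefix exists to prevent.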