Race condition on x86
Could someone explain this statement:
shared variables
x = 0, y = 0

Core 1       Core 2
x = 1;       y = 1;
r1 = y;      r2 = x;
How is it possible to have r1 == 0 and r2 == 0 on x86 processors?
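A minimal C++11 sketch of the same litmus test, assuming relaxed atomics stand in for the plain shared variables; whether the r1 == 0 && r2 == 0 result shows up in any given run depends on timing, but over many iterations it is routinely observed on x86:

#include <atomic>
#include <cstdio>
#include <thread>

int main() {
    for (int i = 0; i < 1000000; ++i) {
        std::atomic<int> x{0}, y{0};      // shared variables, both start at 0
        int r1 = -1, r2 = -1;
        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);   // Core 1: x = 1;
            r1 = y.load(std::memory_order_relaxed);  // Core 1: r1 = y;
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);   // Core 2: y = 1;
            r2 = x.load(std::memory_order_relaxed);  // Core 2: r2 = x;
        });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) {                    // the "impossible" outcome
            std::printf("r1 == 0 && r2 == 0 observed on iteration %d\n", i);
            return 0;
        }
    }
    std::printf("not observed in this run\n");
    return 0;
}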
3 Answers
The problem can arise due to optimizations involving reordering of instructions. In other words, both processors can assign r1 and r2 before assigning the variables x and y, if they find that this would yield better performance. This can be solved by adding a memory barrier, which would enforce the ordering constraint; the slideshow you mentioned in your post makes this same point.
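As a hedged sketch of that fix (my own, not quoted from the answer): with C++11 sequentially consistent atomics the compiler inserts the required barrier for us, typically an mfence or a locked instruction on x86, and the r1 == 0 && r2 == 0 outcome is then forbidden:

#include <atomic>

std::atomic<int> x{0}, y{0};
int r1, r2;

void core1() {
    x.store(1, std::memory_order_seq_cst);   // x = 1;  (fully ordered)
    r1 = y.load(std::memory_order_seq_cst);  // r1 = y;
}

void core2() {
    y.store(1, std::memory_order_seq_cst);   // y = 1;
    r2 = x.load(std::memory_order_seq_cst);  // r2 = x;
}

// An alternative with the same effect: keep the accesses relaxed and place an
// explicit std::atomic_thread_fence(std::memory_order_seq_cst) between the
// store and the load in each function.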
Regarding the x86 architecture, the best resource to read is the Intel® 64 and IA-32 Architectures Software Developer's Manual (Chapter 8.2, Memory Ordering). Sections 8.2.1 and 8.2.2 describe the memory ordering implemented by the Intel486, Pentium, Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium 4, Intel Xeon, and P6 family processors: a memory model called processor ordering, as opposed to the program ordering (strong ordering) of the older Intel386 architecture (where read and write instructions were always issued in the order they appeared in the instruction stream).
The manual describes many ordering guarantees of the processor ordering memory model (such as loads are not reordered with other loads, stores are not reordered with other stores, stores are not reordered with older loads, etc.), but it also describes the allowed reordering that causes the race condition in the OP's post: 8.2.3.4, Loads May Be Reordered with Earlier Stores to Different Locations.
On the other hand, if the original order of the instructions was switched:
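Presumably the switched order referred to is the load-then-store version of the original snippet:

Core 1       Core 2
r1 = y;      r2 = x;
x = 1;       y = 1;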
In this case, the processor guarantees that the r1 = 1 and r2 = 1 outcome is not allowed (due to the 8.2.3.3 Stores Are Not Reordered With Earlier Loads guarantee), meaning that those instructions would never be reordered within the individual cores. To compare this with different architectures, check out this article: Memory Ordering in Modern Microprocessors. You can see that Itanium (IA-64) does even more reordering than the IA-32 architecture.
On processors with a weaker memory consistency model (such as SPARC, PowerPC, Itanium, ARM, etc.), the above condition can take place because of a lack of enforced cache coherency on writes without an explicit memory barrier instruction. So basically Core1 sees the write on x before y, while Core2 sees the write on y before x. A full fence instruction wouldn't be required in this case ... basically you would only need to enforce write or release semantics in this scenario, so that all writes are committed and visible to all processors before reads take place on the variables that have been written to. Processor architectures with strong memory consistency models like x86 typically make this unnecessary, but as Groo points out, the compiler itself could re-order the operations. You can use the volatile keyword in C and C++ to prevent the re-ordering of operations by the compiler within a given thread. That is not to say that volatile will create thread-safe code that manages the visibility of reads and writes between threads ... a memory barrier would be required for that. So while the use of volatile can still create unsafe threaded code, within a given thread it will enforce sequential consistency at the compiled machine-code level.
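A rough sketch of that volatile point (my own, reusing the names from the question): volatile constrains only the compiler, so hardware that allows a store to be reordered with a later load, as x86 does via its store buffer, can still produce the surprising result:

volatile int x = 0, y = 0;
int r1 = 0, r2 = 0;

void core1() {
    x = 1;    // volatile store: the compiler will not move it below the load
    r1 = y;   // volatile load: the CPU may still satisfy it before the store
              // above becomes visible to core2 (store buffer), even on x86
}

void core2() {
    y = 1;
    r2 = x;
}

// Ruling out r1 == 0 && r2 == 0 takes a real barrier between the store and
// the load in each function (e.g. mfence on x86), or std::atomic variables
// with sequentially consistent operations as sketched earlier.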
This is why some say: Threads Considered Harmful
The problem is that neither thread enforces any ordering between its two statements, because they are not inter-dependent.
The compiler knows that x and y are not aliased, and so it is not required to order the operations.
The CPU knows that x and y are not aliased, so it may reorder them for speed. A good example of when this happens is when the CPU detects an opportunity for write combining. It may merge one write with another if it can do so without violating its coherency model.
The mutual dependency looks odd but it's really no different than any other race condition. Directly writing shared-memory-threaded code is quite difficult, and that's why parallel languages and message-passing parallel frameworks have been developed, in order to isolate the parallel hazards to a small kernel and remove the hazards from the applications themselves.