Does std::mutex enforce cache coherence?

Asked 2025-01-29 04:17:50


I have a non-atomic variable my_var and an std::mutex my_mut. I assume up to this point in the code, the programmer has followed this rule:

Each time the programmer modifies or writes to my_var, he locks
and unlocks my_mut.

Assuming this, Thread1 performs the following:

my_mut.lock();
my_var.modify();
my_mut.unlock();

Here is the sequence of events I imagine in my mind:

  1. Prior to my_mut.lock();, there were possibly multiple copies of my_var in main memory and some local caches. These values do not necessarily agree, even if the programmer followed the rule.
  2. By the instruction my_mut.lock();, all writes from the previously executed my_mut critical section are visible in memory to this thread.
  3. my_var.modify(); executes.
  4. After my_mut.unlock();, there are possibly multiple copies of my_var in main memory and some local caches. These values do not necessarily agree, even if the programmer followed the rule. The value of my_var at the end of this thread will be visible to the next thread that locks my_mut, by the time it locks my_mut. (A runnable sketch of this pattern follows the list.)
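To make the sequence concrete, here is a runnable sketch of the pattern (the value 42, the reader thread, and main() are illustrative stand-ins, not part of the original code):

    #include <cassert>
    #include <mutex>
    #include <thread>

    int my_var = 0;     // non-atomic shared variable
    std::mutex my_mut;  // protects my_var

    void writer() {
        my_mut.lock();    // acquire: sees writes from earlier critical sections
        my_var = 42;      // stands in for my_var.modify()
        my_mut.unlock();  // release: publishes the write to the next locker
    }

    void reader() {
        my_mut.lock();      // acquire: synchronizes-with the unlock that preceded it
        int seen = my_var;  // 42 if the writer's critical section ran first, else 0
        my_mut.unlock();
        (void)seen;
    }

    int main() {
        std::thread t1(writer), t2(reader);
        t1.join();
        t2.join();
        assert(my_var == 42);  // join() also synchronizes, so the write is visible here
    }

In real code std::lock_guard or std::scoped_lock would be preferable to bare lock()/unlock() calls; the explicit calls just mirror the snippet above.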

I have been having trouble finding a source that verifies that this is exactly how std::mutex should work. I consulted the C++ standard. From ISO 2013, I found this section:

[ Note: For example, a call that acquires a mutex will perform an
acquire operation on the locations comprising the mutex.
Correspondingly, a call that releases the same mutex will perform a
release operation on those same locations. Informally, performing a
release operation on A forces prior side effects on other memory
locations to become visible to other threads that later perform a
consume or an acquire operation on A.
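The release/acquire pairing the note describes can also be seen directly with std::atomic. Here is a minimal sketch of that pairing (the payload/flag names are my own illustration, not from the standard):

    #include <atomic>
    #include <cassert>
    #include <thread>

    int payload = 0;                // plain, non-atomic data
    std::atomic<bool> flag{false};  // plays the role of location "A" in the note

    void writer() {
        payload = 123;  // prior side effect on another memory location
        flag.store(true, std::memory_order_release);  // release operation on A
    }

    void reader() {
        while (!flag.load(std::memory_order_acquire)) {}  // acquire operation on A
        assert(payload == 123);  // the release on A made the prior write visible
    }

    int main() {
        std::thread w(writer), r(reader);
        w.join();
        r.join();
    }

A mutex bundles the same pairing: unlock() performs the release and the next lock() performs the acquire, on the locations comprising the mutex.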

Is my understanding of std::mutex correct?


Comments (2)

一曲爱恨情仇 2025-02-05 04:17:50


C++ operates on relations between operations, not on particular hardware terms (like cache coherence). So the C++ Standard has a happens-before relationship, which roughly means that whatever happened before has completed all of its side effects, and those effects are therefore visible to whatever happens after.

Given an exclusive critical section that you have entered, whatever happens within it happens before the next time that critical section is entered. So any subsequent entry into it will see everything that happened before. That's what the Standard mandates. Everything else (including cache coherence) is the implementation's duty: it has to make sure that the described behavior is consistent with what actually happens.
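Spelling that out with the question's names (the edge labels are mine, as a sketch, assuming thread2 takes the lock after thread1 has released it):

    #include <mutex>

    std::mutex my_mut;
    int my_var = 0;  // stands in for the question's non-atomic variable

    void thread1() {
        my_mut.lock();
        my_var = 1;       // (A)
        my_mut.unlock();  // (B) release: (A) is sequenced-before (B)
    }

    void thread2() {      // assumed to take the lock after thread1 released it
        my_mut.lock();    // (C) acquire: (B) synchronizes-with (C)
        int x = my_var;   // (D): (C) is sequenced-before (D)
        my_mut.unlock();
        // (A) happens-before (D) via the chain A -> B -> C -> D,
        // so x == 1 is guaranteed here.
        (void)x;
    }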

戏蝶舞 2025-02-05 04:17:50


After my_mut.unlock();, there are possibly multiple copies of my_var in main memory and some local caches. These values do not necessarily agree, ...

Hardware already maintains cache coherence, so conflicting copies in different caches are impossible on real-world systems. AFAIK, there are no C++ implementations that run std::thread across cores without coherent caches, and that's unlikely to become a thing in the future. There are heterogeneous systems like an ARM DSP + MCU, but you don't run threads of one program across such cores. (And you don't boot a single OS across such cores.)

There will be a value in DRAM for the address, but all CPU cores access memory through cache, so that value doesn't matter: a Modified copy in another core's cache takes priority, thanks to hardware cache coherence.

See also

  • https://en.wikipedia.org/wiki/MESI_protocol, the standard cache-coherency protocol. Modern CPUs don't use a shared bus, though; they use a directory (e.g. L3 tags) to keep track of which core might have a modified copy of any given line, so they know which core to signal to write back a line when a Read For Ownership (write miss) or share request (read miss) happens for that line.
  • When to use volatile with multi threading? (Never, except Linux kernel code which does roll its own memory_order_relaxed ops with volatile on GCC and Clang, with inline asm for more ordering when needed. But cache-coherent hardware is why just volatile does work a lot like atomic with relaxed.)
  • Is cache coherency required for memory consistency? including discussion in comments - implementing C++'s coherency requirements with manual flushing would be very onerous, e.g. every release store would have to know what parts of cache to flush, but the compiler normally doesn't know which variables are shared or not. And worse, dirty write-back caches would need to get written back before writes from other cores so our later loads can actually see them.
  • http://eel.is/c++draft/intro.races#19 - [Note 19: The four preceding coherence requirements effectively disallow compiler reordering of atomic operations to a single object, even if both operations are relaxed loads. This effectively makes the cache coherence guarantee provided by most hardware available to C++ atomic operations. — end note] (A small demonstration follows this list.)
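As a quick illustration of that per-object coherence guarantee (a sketch of mine, not from the linked note): even fully relaxed operations on a single atomic object share one modification order, so a relaxed counter never loses increments.

    #include <atomic>
    #include <cassert>
    #include <thread>
    #include <vector>

    int main() {
        std::atomic<int> counter{0};
        std::vector<std::thread> threads;

        // Four threads, each adding 1000 with relaxed ordering. Relaxed imposes
        // no ordering with respect to other memory locations, but all RMW
        // operations on `counter` still act on its single modification order.
        for (int i = 0; i < 4; ++i) {
            threads.emplace_back([&] {
                for (int j = 0; j < 1000; ++j)
                    counter.fetch_add(1, std::memory_order_relaxed);
            });
        }
        for (auto& t : threads) t.join();

        assert(counter.load() == 4000);  // no increment is ever lost
    }

Relaxed ordering gives no visibility guarantees for other locations; only operations on counter itself are kept coherent, which is exactly the hardware guarantee the note refers to.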

Programs running on cores with non-coherent shared memory can use it for message-passing, e.g. via MPI, where the program is explicit about which memory regions are flushed when. C++'s multithreaded memory model is not suitable for such systems. That's why mainstream multi-CPU systems are ccNUMA; non-coherent shared memory can be found between nodes of a cluster, but again that's where you'd use MPI or something, not C++ threads across separate instances of an OS running on separate nodes.
