How does a threading system cope with shared data in different CPU caches?



I'm coming largely from a C++ background, but I think this question applies to threading in any language. Here's the scenario:

  1. We have two threads (ThreadA and ThreadB), and a value x in shared memory

  2. Assume that access to x is appropriately controlled by a mutex (or other suitable synchronization control)

  3. If the threads happen to run on different processors, what happens if ThreadA performs a write operation, but its processor places the result in its L2 cache rather than the main memory? Then, if ThreadB tries to read the value, will it not just look in its own L1/L2 cache / main memory and then work with whatever old value was there?

If that's not the case, then how is this issue managed?

If that is the case, then what can be done about it?

4 Answers

时间海 2024-08-02 11:32:38


Your example would work just fine.

Multiple processors use a coherency protocol such as MESI to ensure that data remains in sync between the caches. With MESI, each cache line is considered to be either modified, exclusively held, shared between CPUs, or invalid. Writing a cache line that is shared between processors forces it to become invalid in the other CPUs, keeping the caches in sync.
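
As a purely illustrative toy model (not how the hardware implements it), the write-invalidate rule could be sketched like this; the two-CPU setup and the writeLine helper are assumptions for illustration only:

#include <cstddef>
#include <iostream>
#include <vector>

// The four MESI states a cache line can be in.
enum class LineState { Modified, Exclusive, Shared, Invalid };

// One entry per CPU for a single cache line; start with both CPUs sharing it.
std::vector<LineState> line = {LineState::Shared, LineState::Shared};

// When one CPU writes a shared line, every other copy is invalidated --
// this is the step that keeps the caches in sync.
void writeLine(std::size_t writer) {
    for (std::size_t cpu = 0; cpu < line.size(); ++cpu)
        line[cpu] = (cpu == writer) ? LineState::Modified : LineState::Invalid;
}

int main() {
    writeLine(0);  // CPU 0 writes: its copy becomes Modified
    std::cout << (line[1] == LineState::Invalid) << '\n';  // prints 1: CPU 1 must re-fetch
}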

However, this is not quite enough. Different processors have different memory models, and most modern processors reorder memory accesses to some degree. In these cases, memory barriers are needed.

For instance, if you have Thread A:

DoWork();
workDone = true;

And Thread B:

while (!workDone) {}
DoSomethingWithResults()

With both running on separate processors, there is no guarantee that the writes done within DoWork() will be visible to Thread B before the write to workDone is, so DoSomethingWithResults() could proceed with a potentially inconsistent state. Memory barriers guarantee some ordering of the reads and writes: adding a memory barrier after DoWork() in Thread A would force all reads/writes done by DoWork() to complete before the write to workDone, so that Thread B gets a consistent view. Mutexes inherently provide a memory barrier, so reads/writes cannot move past the calls to lock and unlock.
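
In C++11 and later, that barrier can be expressed directly with std::atomic release/acquire operations. Here is a minimal sketch of the flag example above, assuming for illustration that DoWork() just fills in a plain int called result:

#include <atomic>
#include <cassert>
#include <thread>

int result = 0;                        // plain data written by the "DoWork()" step
std::atomic<bool> workDone{false};

void threadA() {
    result = 42;                       // DoWork()'s writes
    // Release barrier: no earlier write may be reordered past this store.
    workDone.store(true, std::memory_order_release);
}

void threadB() {
    // Acquire barrier: no later read may be reordered before this load.
    while (!workDone.load(std::memory_order_acquire)) {}
    assert(result == 42);              // Thread B is guaranteed a consistent view
}

int main() {
    std::thread a(threadA);
    std::thread b(threadB);
    a.join();
    b.join();
}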

In your case, one processor would signal to the others that it dirtied a cache line and force the other processors to reload from memory. Acquiring the mutex to read and write the value guarantees that the change to memory is visible to the other processor in the order expected.

临走之时 2024-08-02 11:32:38


Most locking primitives like mutexes imply memory barriers. These force a cache flush and reload to occur.

For example,

ThreadA {
    x = 5;         // probably writes to cache
    unlock mutex;  // forcibly writes local CPU cache to global memory
}
ThreadB {
    lock mutex;    // discards data in local cache
    y = x;         // x must read from global memory
}
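
For reference, a runnable C++ version of that pseudocode might look like the following; this is a sketch in which std::mutex and std::lock_guard stand in for the lock/unlock calls, and the thread bodies are assumptions for illustration:

#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
int x = 0;

void threadA() {
    std::lock_guard<std::mutex> guard(m); // lock mutex
    x = 5;                                // write inside the critical section
}                                         // unlock publishes the write to other threads

void threadB() {
    std::lock_guard<std::mutex> guard(m); // lock discards any stale view of x
    int y = x;                            // sees either 0 or 5, never a stale/torn value
    std::cout << y << '\n';
}

int main() {
    std::thread a(threadA);
    std::thread b(threadB);
    a.join();
    b.join();
}
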
诗化ㄋ丶相逢 2024-08-02 11:32:38


In general, the compiler understands shared memory and takes considerable care to ensure that shared memory is placed in a sharable location. Modern compilers are very sophisticated in the way they order operations and memory accesses; they tend to understand the nature of threading and shared memory. That's not to say they're perfect, but in general, much of the concern is taken care of by the compiler.

辞取 2024-08-02 11:32:38


C# has some built-in support for this kind of problem.
You can mark a variable with the volatile keyword, which prevents its reads and writes from being cached away or reordered, so that it stays synchronized across all CPUs.

public static volatile int loggedUsers;
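
For comparison, from a C++ perspective the closest everyday counterpart is std::atomic rather than C++'s own volatile (which gives no such guarantee). A minimal sketch, with loggedUsers and the two helper functions assumed for illustration:

#include <atomic>
#include <iostream>

std::atomic<int> loggedUsers{0};

// Unlike C# volatile, std::atomic also makes the read-modify-write itself
// atomic, so concurrent increments cannot be lost.
void onLogin()  { loggedUsers.fetch_add(1); }
void onLogout() { loggedUsers.fetch_sub(1); }

int main() {
    onLogin();
    onLogin();
    onLogout();
    std::cout << loggedUsers.load() << '\n';  // prints 1
}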

The other part is the pair of .NET methods System.Threading.Monitor.Enter(x) and System.Threading.Monitor.Exit(x) (which C#'s lock statement wraps syntactically), where x is the variable to lock. This causes other threads trying to lock x to wait until the locking thread calls Exit(x).

public List<string> users;  // element type assumed for illustration
// In some function:
System.Threading.Monitor.Enter(users);
try {
   // do something with users
}
finally {
   System.Threading.Monitor.Exit(users);
}