Concurrency and memory models



I'm watching this video by Herb Sutter on GPGPU and the new C++ AMP library. He talks about memory models and mentions Weak Memory Models and then Strong Memory Models. I think he's referring to read/write ordering and the like, but I'm not sure.

Google turns up some interesting results (mostly scientific papers) on memory models, but can someone explain what a Weak Memory Model and a Strong Memory Model are, and how they relate to concurrency?


Comments (4)

我只土不豪 2024-12-09 02:50:13


In terms of concurrency, a memory model specifies the constraints on data accesses, and the conditions under which data written by one thread/core/processor becomes visible to another.

The terms weak and strong are somewhat ambiguous, but the basic premise is that a strong memory model places a lot of constraints on the hardware to ensure that writes by one thread/core/processor are visible to other threads/cores/processors in clearly-defined orders, whilst allowing the programmer maximum freedom of data access.

On the other hand, a weak model places very few constraints on the hardware, and instead places the responsibility for ensuring visibility in the hands of the programmer.

The strongest memory model is Sequential Consistency: all operations to all data by all processors form a single total order agreed on by all processors, which is consistent with the order of operations on each processor individually. This is essentially an interleaving of the operations of each processor.
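As an illustration (not part of the original answer), here is a minimal C++11 sketch of the classic store-buffering litmus test; the thread functions and variable names are made up for this example. Under sequential consistency, the outcome r1 == 0 and r2 == 0 is impossible, because every execution must correspond to some single interleaving of the four operations, and in any interleaving at least one store precedes the opposing load.

```cpp
// Store-buffering litmus test -- an illustrative sketch using C++11 atomics.
// With the default memory_order_seq_cst, all four operations fall into one
// total order, so (r1, r2) == (0, 0) can never be observed.
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<int> x{0}, y{0};
int r1 = 0, r2 = 0;

void thread_a() {
    x.store(1);      // seq_cst store
    r1 = y.load();   // seq_cst load
}

void thread_b() {
    y.store(1);      // seq_cst store
    r2 = x.load();   // seq_cst load
}

int main() {
    std::thread a(thread_a), b(thread_b);
    a.join();
    b.join();
    // Possible results: (1,1), (0,1), (1,0) -- but never (0,0) under seq_cst.
    std::printf("r1=%d r2=%d\n", r1, r2);
}
```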

The weakest memory model imposes no restrictions on the order in which processors see each other's writes. Different processors in the same system may see writes in different orders, and some processors may use "stale" data from their own cache for a long time after another processor has written to the same memory address. Sometimes whole cache lines are treated as a single unit, so a write by one processor to one variable on a cache line can effectively discard writes made by other processors to other variables on that line which are not yet visible to the first processor: the stale values are written over the top when it eventually writes the cache line back to memory. Under such a scheme, extreme care must be taken, using explicit synchronization instructions, to ensure that data is transferred to other processors in the correct order.

For example, the Intel x86 memory model is generally considered to be on the stronger end, as there are strict rules about the order in which writes become visible to other processors, whereas the DEC Alpha and ARM processors are generally considered to have weak memory models, as writes from one processor are only required to be visible to other processors in a particular order if you explicitly put ordering instructions (memory fences or barriers) in your code.
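To make the fence idea concrete, here is a hedged sketch (the function and variable names are invented for this example) of the message-passing pattern using the portable C++11 fence API, which the compiler lowers to whatever barrier the target needs (for example a dmb on ARM, and usually nothing extra for this pattern on x86):

```cpp
// Message passing with explicit fences -- an illustrative sketch.
// The release fence keeps the data store before the flag store;
// the acquire fence keeps the data load after the flag load.
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // ordinary (non-atomic) data
std::atomic<bool> ready{false};  // synchronization flag

void producer() {
    payload = 42;                                         // 1. write the data
    std::atomic_thread_fence(std::memory_order_release);  // 2. earlier writes may not move past this
    ready.store(true, std::memory_order_relaxed);         // 3. publish the flag
}

void consumer() {
    while (!ready.load(std::memory_order_relaxed)) { }    // 1. wait for the flag
    std::atomic_thread_fence(std::memory_order_acquire);  // 2. later reads may not move before this
    assert(payload == 42);                                // 3. the data is now guaranteed visible
}

int main() {
    std::thread c(consumer), p(producer);
    p.join();
    c.join();
}
```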

Some systems have memory that is only accessible by particular processors. Transferring data between those processors therefore requires explicit data transfer instructions. This is the case with the Cell processor, and is often the case with GPUs as well. It can be viewed as an extreme form of a weak memory model: data is only visible to other processors if you explicitly invoke the data transfer.

Programming languages usually impose their own memory models on top of whatever is provided by the underlying processors. For example, C++0x (now C++11) specifies a complete set of ordering constraints, ranging from fully relaxed to full sequential consistency, so you can specify in code what you require. On the other hand, Java has a very specific set of ordering constraints that must be adhered to and cannot be varied. In both cases the compiler must translate the desired constraints into the relevant instructions for the underlying processor, which may be quite involved if you request sequential consistency on a weakly ordered machine.
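As a rough sketch of that spectrum (the variable names are illustrative, not from the answer), C++11 std::atomic operations accept a std::memory_order argument that selects how much ordering you pay for:

```cpp
#include <atomic>

std::atomic<int>  counter{0};
std::atomic<bool> flag{false};
int data = 0;

void writer() {
    // Fully relaxed: atomicity only, no ordering with other memory accesses.
    counter.fetch_add(1, std::memory_order_relaxed);

    // Release: everything written before this store becomes visible to a
    // thread that performs an acquire load which reads the stored value.
    data = 123;
    flag.store(true, std::memory_order_release);
}

void reader() {
    // Acquire: pairs with the release store above.
    if (flag.load(std::memory_order_acquire)) {
        // data == 123 is guaranteed here.
    }

    // Default ordering: sequentially consistent, the strongest (and usually
    // the most expensive) option on weakly ordered hardware.
    counter.store(5);  // same as counter.store(5, std::memory_order_seq_cst);
}
```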

三生殊途 2024-12-09 02:50:13


The two terms aren't clearly defined, and it's not a black/white thing.

Memory models can be extremely weak, extremely strong, or anywhere in between.

It basically refers to the guarantees offered about concurrent memory accesses.

Naively, you would expect a write made on one thread to be immediately visible to all other threads, and you would expect events to appear in the same order on all threads as well.

But in a weaker memory model, neither of those may hold.

Sequential consistency is the term for a memory model which guarantees that events are seen in the same order across all threads. So a memory model which ensures sequential consistency is pretty strong.

A weaker guarantee is causal consistency: the guarantee that events are observed after the events they depend on.

In other words, if you first write a value x to some address A, and then write a second value y to the same address, then no thread that has read the value y will ever subsequently read the older value x. Because the two writes are to the same address, it would violate causal consistency if not all threads observed them in the same order.
But this says nothing about what should happen to unrelated events. The result of writing a third value to a different memory address could be observed at absolutely any time by other threads (so different threads may observe events in a different order, unlike under sequential consistency).
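A hedged C++ analogue of that last point (the names A, B and the functions are invented for this sketch): with relaxed atomics, the two writes to A are still ordered with respect to each other by per-object coherence, but the write to the unrelated address B carries no ordering at all relative to them.

```cpp
#include <atomic>
#include <cstdio>

std::atomic<int> A{0};  // the address written twice ("x" then "y")
std::atomic<int> B{0};  // an unrelated address

void writer() {
    A.store(1, std::memory_order_relaxed);  // the "x" write
    A.store(2, std::memory_order_relaxed);  // the "y" write
    B.store(7, std::memory_order_relaxed);  // unrelated write
}

void observer() {
    int b = B.load(std::memory_order_relaxed);
    int a = A.load(std::memory_order_relaxed);
    // Coherence on A alone guarantees that once this thread has seen A == 2,
    // a later load of A will never return 1 again. But b == 7 may be observed
    // while a is still 0, 1, or 2 -- the write to B is not ordered with A's writes.
    std::printf("a=%d b=%d\n", a, b);
}
```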

There are plenty of other such levels of "consistency", some stronger, some weaker, offering all sorts of subtle guarantees about what you can rely on.

Fundamentally, a stronger memory model is going to offer more guarantees about the order in which events are observed, and will normally guarantee behavior closer to what you'd intuitively expect.

But a weaker model allows more room for optimization and, in particular, scales better with more cores (because less synchronization is required).

Sequential consistency is basically free on a single-core CPU and doable on a quad-core, but it becomes prohibitively expensive on a 32-core system, a system with 4 physical CPUs, or a shared-memory system spanning multiple physical machines.

The more cores you have, and the further apart they are, the harder it is to ensure that they all observe events in the same order. So compromises are made, and you settle for a weaker memory model which makes looser guarantees.

梅倚清风 2024-12-09 02:50:13


Yes, you are right: the difference between Weak and Strong memory models is a difference in which optimizations are available (reordering of reads/writes and the related fences).

You can specify a memory model by starting with a sequentially consistent model (the most restrictive, or strongest, model), and then specifying how reads and writes from a single thread can be introduced, removed, or moved with respect to one another.

In this (sequentially consistent) model, memory is independent of any of the processors (threads) that use it. The memory is connected to each of the threads by a controller that feeds read and write requests from each thread. The reads and writes from a single thread reach memory in exactly the order specified by the thread, but they might be interleaved with reads and writes from other threads in an unspecified way.

Understand the Impact of Low-Lock Techniques in Multithreaded Apps
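As a hedged C++ illustration of reads being "removed or moved" under a weaker model (the flag and function names are made up for this sketch, and it is not taken from the article above): spinning on a plain, non-atomic flag is a data race, so the compiler is free to hoist the read out of the loop and spin forever on a stale value; making the flag atomic forbids that transformation and supplies the required ordering.

```cpp
#include <atomic>

// Broken: 'stop_plain' is an ordinary bool, so concurrent access is a data
// race and the compiler may legally read it once and loop forever.
bool stop_plain = false;
void worker_broken() {
    while (!stop_plain) { /* spin -- the read may be hoisted out of the loop */ }
}

// Fixed: an atomic flag forces a real load on every iteration and, with
// acquire/release, also orders any data published before the stop request.
std::atomic<bool> stop{false};
void worker_fixed() {
    while (!stop.load(std::memory_order_acquire)) { /* spin */ }
}

void request_stop() {
    stop.store(true, std::memory_order_release);
}
```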

However, there is no exact boundary between strong and weak memory models, unless you compare the sequentially consistent model against all the others. Some models are simply stronger or weaker than others, and therefore more or less open to optimization by reordering. For example, the memory model in .NET 2.0 for x86 allows a few more optimizations than the one in .NET 1.1, so it can be considered a weaker model.

唠甜嗑 2024-12-09 02:50:13


Google turns up some interesting results (mostly science papers) on memory models, but can someone explain what is a Weak Memory Model and what is a Strong Memory Model and their relation to concurrency?

A strong memory model is one where, from the point of view of other cores, reads and writes appear to happen as written in the program and, in particular, in the order in which they appear in the program. This is known as sequential consistency.

A weak memory model is one where memory operations may be reordered or otherwise changed by the CPU. All practical CPU architectures allow instructions to be reordered.

Note that Herb Sutter uses "strong memory model" to mean one where atomic intrinsics are not reordered. This is not the commonly accepted definition.
