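Does writing the same value to the same memory location cause a data race?
Consider the following code, which writes the same values to the same memory locations from multiple threads: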
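    #include <cassert>
    #include <thread>

    // Fills buf[0..n) with its own indices, then reads back the middle element.
    void f(int* buf, int n, int* p) {
        for (int i = 0; i < n; i++)
            buf[i] = i;
        *p = buf[n/2];
    }

    // Runs f on two threads over the same buffer and compares what each one read.
    void g(int* buf, int n) {
        int x1, x2;
        std::thread t1(f, buf, n, &x1);
        std::thread t2(f, buf, n, &x2);
        t1.join();
        t2.join();
        assert(x1 == x2);
    }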
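Interesting as it is, I'm less concerned with what the standard guarantees, since I assume it guarantees nothing. What I really care about is how the code above behaves on real-world multiprocessor hardware. Will the assert always pass, or is there any possibility of a race condition, cache synchronization problems, and so on?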
Comments (3)
There is a race, but in your example both threads write the same values to the same addresses. Since you are not doing any read-modify-write operations, only writing predetermined numbers, this will be safe in most cases. Writing an int is a single atomic instruction on most systems. The exception would be running this code on an 8-bit microprocessor that uses a sequence of instructions to store an int. Even then it may still work, but that depends on the implementation of the library code that performs the multi-byte store.
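One way to keep the same-value stores while removing the formal data race is to make each element a std::atomic<int> and use relaxed stores. This is not something the answer proposes; it is a minimal sketch assuming C++17, and fill_and_read and run_two_writers are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: stores i into buf[i] with relaxed atomic stores,
// then reads back the middle element into *out.
void fill_and_read(std::vector<std::atomic<int>>& buf, int* out) {
    const std::size_t n = buf.size();
    for (std::size_t i = 0; i < n; ++i)
        buf[i].store(static_cast<int>(i), std::memory_order_relaxed);
    *out = buf[n / 2].load(std::memory_order_relaxed);
}

// Hypothetical driver: runs two writers over the same buffer and reports
// whether both of them read the expected middle value.
bool run_two_writers(std::size_t n) {
    assert(n > 0);                          // buf[n / 2] must exist
    std::vector<std::atomic<int>> buf(n);   // n atomics, overwritten below
    int x1 = -1, x2 = -1;
    std::thread t1(fill_and_read, std::ref(buf), &x1);
    std::thread t2(fill_and_read, std::ref(buf), &x2);
    t1.join();
    t2.join();
    return x1 == x2 && x1 == static_cast<int>(n / 2);
}
```

Each thread stores n/2 into the middle element before loading it, and the only other store to that element writes the same value, so the load should not be able to observe anything else. On mainstream targets a relaxed store to a lock-free std::atomic<int> typically compiles to an ordinary store, so the cost of this change is usually small.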
Memory models for multithreading are concerned with when the effects of writes made by one thread become observable by another thread. In the code you posted, both threads write the same values into the same memory locations, so it doesn't matter which thread's write the read of buf[n/2] observes; either will do. Modern processors employ cache coherency protocols such as MESI, so when the threads write to the buffer concurrently, a lot of messages are sent between the CPUs to synchronize the cache lines holding the buffer, which makes it run much slower than in a non-concurrent scenario (cache-line contention, similar in effect to false sharing).
Here it doesn't matter whether the writes are atomic, since both threads write the same values to the same memory locations. There is a race, but it doesn't matter which thread wins, because the observed values will be the same even with partial writes.
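The point about partial writes can be checked with a toy model. The sketch below assumes a machine that stores a 32-bit value as four in-order byte stores (as in the 8-bit microprocessor case), and enumerates every interleaving of two writers storing the same value. The names split and same_value_survives_tearing are hypothetical, and the model says nothing about what a compiler may do with a formal data race.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 4>;

// Splits a 32-bit value into four bytes, least significant first.
Bytes split(std::uint32_t v) {
    Bytes b{};
    for (std::size_t i = 0; i < 4; ++i)
        b[i] = static_cast<std::uint8_t>(v >> (8 * i));
    return b;
}

// Tries all interleavings of two writers that each store the same value
// one byte at a time. Returns true if, in every interleaving, memory holds
// the full value from the moment either writer has finished its own store.
bool same_value_survives_tearing(std::uint32_t value) {
    const Bytes src = split(value);
    for (unsigned sched = 0; sched < 256; ++sched) {
        // Bit k of sched picks the writer for step k; keep only schedules
        // in which each writer performs exactly four byte stores.
        int ones = 0;
        for (int k = 0; k < 8; ++k) ones += (sched >> k) & 1u;
        if (ones != 4) continue;

        Bytes mem = split(~value);          // memory starts with every bit different
        std::size_t next[2] = {0, 0};       // next byte index for each writer
        for (int k = 0; k < 8; ++k) {
            const std::size_t w = (sched >> k) & 1u;
            mem[next[w]] = src[next[w]];
            ++next[w];
            const bool someone_done = next[0] == 4 || next[1] == 4;
            if (someone_done && mem != src) return false;
        }
    }
    return true;
}
```

In this model a writer that has completed its own four byte stores always sees the full value, because the other writer can only overwrite bytes with identical contents.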
The key point here is indeed cache coherency, as @Maxim said. On a cache-coherent architecture a wrong result is indeed impossible.
However, it can go wrong on a machine without cache coherency. I don't know of a specific architecture, and although such machines are almost extinct due to natural selection, as far as I know some remain. (If you know an example, please comment.)
Here is a table that represents an execution of two threads filling a zeroed region in memory with ones. For brevity this example is scaled down by a factor of 32, i.e. each digit here represents one 4-byte int. The cache line size is 4 ints == 4 digits. The lines marked as "flush" are points where the on-chip cache is flushed to main memory. In reality this is non-deterministic, as it may happen at any time, e.g. due to a preemptive task switch.
So we got a wrong result in the end.
I emphasize again that this counter-example is valid only on cache-incoherent machines.
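To make the failure mode concrete, here is a small simulation of one possible schedule on such a machine. It is a hypothetical reconstruction in the spirit of the execution described above, not the answer's own data: it assumes each CPU has a private write-back cache that loads and flushes whole 4-int lines with no coherence protocol and no per-int dirty tracking. ToyCache and run_incoherent_schedule are made-up names.

```cpp
#include <array>
#include <cassert>

constexpr int kLine = 4;                    // cache line size in ints
using Line = std::array<int, kLine>;

// Toy model of one CPU's private write-back cache holding a single line.
// No coherence: load() and flush() copy the whole line blindly.
struct ToyCache {
    Line data{};
    void load(const Line& mem)  { data = mem; }   // fetch the line from main memory
    void write(int i, int v)    { data[i] = v; }  // store hits the local copy only
    void flush(Line& mem) const { mem = data; }   // write back all 4 ints, stale ones included
};

// Both simulated threads store 1 into every int of the line.
// Returns the final contents of main memory.
Line run_incoherent_schedule() {
    Line mem{};                                   // zeroed region: 0000
    ToyCache x, y;

    y.load(mem);  y.write(0, 1);                  // y: 1000
    x.load(mem);  x.write(0, 1);  x.write(1, 1);  // x: 1100
    x.flush(mem);                                 // mem: 1100, x then drops the line
    y.flush(mem);                                 // mem: 1000, y's stale 0 clobbers x's store
    x.load(mem);  x.write(2, 1);  x.write(3, 1);  // x reloads 1000 and ends with 1011
    y.write(1, 1); y.write(2, 1); y.write(3, 1);  // y: 1111
    y.flush(mem);                                 // mem: 1111
    x.flush(mem);                                 // mem: 1011
    return mem;
}
```

Under these assumptions main memory ends up as 1011 even though both threads stored 1 everywhere. A read of element 1 from x's own cache after its reload would also return 0 while y's would return 1, which is the kind of mismatch that would trip the assert in the question.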