.NET: volatile 32-bit vs. non-volatile 64-bit values and thread safety. What is the best way to handle 64-bit values?
I understand that volatile prevents certain (but not all) optimizations from being performed on variables. Although the documentation is a bit confusing on the topic (e.g. Wikipedia and MSDN contradict each other), I understand that volatile applies a half memory fence, which prevents certain reordering operations (ref. Albahari).
I also understand that it prevents the use of registers, which means that reads can never be stale due to, for example, variable hoisting in loops.
I also know from experience that the compiler performs different (undocumented, AFAIK) optimizations on different data types, making the area somewhat unpredictable.
However, something remains totally unclear to me. 64 bit values such as long cannot be decorated with volatile.
So my question is: how should such variables be handled so that they are treated as the equivalent of volatile 32-bit value types?
This strikes me as an inconsistency since I don't believe that 32 and 64 bit values should be treated differently.
EDIT
Further, why does a memory barrier guarantee that the compiler will write ASM that will fetch a value from RAM and not a register? I understand why volatile will do this.
Comments (3)
It is simply a restriction imposed by a 32-bit jitter, like the common x86 one. It cannot guarantee that variables are aligned any better than on an address that's a multiple of 4. Mostly because the heap allocator doesn't promise a better job. And no effort was made either to keep stack addresses aligned better than 4. Accordingly, a 64-bit variable may straddle the boundary of a cache line. And thus can't be atomic.
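As a minimal sketch of one way around this restriction, the Interlocked class provides 64-bit overloads that are atomic even on a 32-bit runtime. The wrapper type AtomicInt64 and its member names are made up here for illustration; only the Interlocked calls are real API.

```csharp
using System.Threading;

// Hypothetical wrapper (name and shape are illustrative, not from the answer):
// every access to the 64-bit field goes through Interlocked, so a reader
// never observes a half-written value, even on a 32-bit runtime.
public sealed class AtomicInt64
{
    private long _value;

    // Atomic 64-bit read.
    public long Read() => Interlocked.Read(ref _value);

    // Atomic 64-bit write; the previous value returned by Exchange is ignored.
    public void Write(long newValue) => Interlocked.Exchange(ref _value, newValue);

    // Atomic read-modify-write; returns the incremented value.
    public long Increment() => Interlocked.Increment(ref _value);
}
```

The trade-off is that every access pays for an interlocked operation, which is more expensive than a plain read or write of an int field.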
Look at this great blog from Eric Lippert and follow the conclusion of the article ;-)
Especially on 32-bit systems, access to 64-bit longs is not atomic!
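One conservative way to sidestep non-atomic 64-bit access is to guard the field with a lock, roughly as sketched below. The type LockedInt64 and its members are hypothetical names chosen for this example; it is not taken from the linked article.

```csharp
// Hypothetical example type: a long that is only ever touched under a lock.
// The lock makes each read and write indivisible with respect to other
// threads using the same lock, and it also provides the memory ordering
// that volatile would otherwise be used for.
public sealed class LockedInt64
{
    private readonly object _gate = new object();
    private long _value;

    public long Value
    {
        get { lock (_gate) { return _value; } }
        set { lock (_gate) { _value = value; } }
    }

    // Compound operations must also run under the same lock.
    public long Add(long delta)
    {
        lock (_gate)
        {
            _value += delta;
            return _value;
        }
    }
}
```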
This was originally a comment but became an answer.
Couldn't you use Thread.MemoryBarrier to achieve a similar thing? Although annoying, I think it's reasonable that you can't have a volatile long, because the compiler cannot guarantee support on all platforms, whereas with 32-bit values it can.
Thread.MemoryBarrier is a way to tell the OS at runtime to do it on your behalf, offloading the problem to the OS instead, in the same way that the Interlocked methods are not shorthand for the machine-code instructions that perform the same thing, because those instructions cannot be guaranteed to be available on every CPU.
Volatility, as you rightly point out, is enforced on many levels: compiler, runtime and CPU at least. In this case, .NET can guarantee all three for 32-bit values, since 32-bit is ubiquitous. CPU support isn't necessarily available for 64-bit values through .NET, though, and ticking only 2 of the 3 boxes is not enough to guarantee that the code will still work.
Edit (in response to your additional comment at the end of your question)
My understanding of MemoryBarrier is that it is an OS-enforced signal to all threads to flush all writes to RAM, which then forces the current thread to reload from RAM (back into registers), thus ensuring that the current thread has the latest version of a value.
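A rough sketch of how Thread.MemoryBarrier might be used to publish a long follows. The class OneShotPublisher and its members are hypothetical, and the sketch assumes the value is published only once: a barrier orders memory accesses but does not by itself make a 64-bit write atomic on a 32-bit runtime, so repeated publishing would still need Interlocked or a lock.

```csharp
using System.Threading;

// Hypothetical example type: one thread publishes a 64-bit value once,
// other threads poll for it.
public sealed class OneShotPublisher
{
    private long _payload;
    private bool _ready;

    // Assumed to be called at most once, by a single thread.
    public void Publish(long value)
    {
        _payload = value;
        Thread.MemoryBarrier(); // the payload write cannot move below this point
        _ready = true;
    }

    public bool TryRead(out long value)
    {
        bool ready = _ready;
        Thread.MemoryBarrier(); // the payload read cannot move above this point
        value = ready ? _payload : 0L;
        return ready;
    }
}
```

Because the flag is set only after the payload has been completely written, a reader that sees the flag also sees the whole 64-bit value, under the single-publication assumption above.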