How are immutable objects thread safe?
Everybody says that immutable objects are thread safe, but why is this?
Take the following scenario running on a multi-core CPU:
- Core 1 reads an object at memory location 0x100 and it is cached in the L1/L2 cache of Core 1;
- The GC collects this object at that memory location because it has become eligible, and 0x100 becomes available for new objects;
- Core 2 allocates an (immutable) object which is located at address 0x100;
- Core 1 gets a reference to this new object and reads it at memory location 0x100.
In this situation, when Core 1 asks for the value at location 0x100, is it possible that it reads stale data from its L1/L2 cache? My intuition says that a memory gate (memory barrier) is still needed here to ensure that Core 1 reads the correct data.
Is the above analysis correct and is a memory gate required, or am I missing something?
UPDATE:
The situation I describe here is a more complex version of what happens every time the GC does a collect. When the GC collects, memory is reordered. This means that the physical location of an object changes and that L1/L2 must be invalidated. Roughly the same applies to the example above.
Since it is reasonable to expect that .NET ensures that different cores see the correct memory state after memory has been reordered, the above situation will not be a problem either.
Comments (4)
The object's immutability isn't the real question in your scenario. Rather, the issue in your description revolves around the reference, list, or other structure that points to the object. It would of course need some sort of technique to make sure the old object is no longer available to a thread that may have tried to access it.
The real point of an immutable object's thread safety is that you don't need to write a bunch of code to achieve it. Rather, the framework, OS, CPU (and whatever else) do the work for you.
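To illustrate the distinction between the immutable object and the mutable reference that points to it, here is a minimal sketch. It is written in Java rather than .NET purely as an analogue of a managed, garbage-collected runtime; `Point`, `SafePublication`, and `countTornReads` are hypothetical names made up for this example. The object has only `final` fields, and the shared reference is held in an `AtomicReference`, so the only thing that needs synchronization is the reference itself.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable type: every field is final and there are no setters.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class SafePublication {
    // The reference is the mutable part. AtomicReference gives it volatile
    // read/write semantics, so a reader sees a fully constructed Point.
    static final AtomicReference<Point> current =
            new AtomicReference<>(new Point(0, 0));

    // One thread publishes new Points (invariant: y == -x) while another
    // checks every Point it observes. Returns the number of Points that
    // violated the invariant.
    static int countTornReads(int iterations) {
        int[] torn = {0}; // written only by the reader thread
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                current.set(new Point(i, -i));
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                Point p = current.get();
                if (p.y != -p.x) {
                    torn[0]++;
                }
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn[0];
    }

    public static void main(String[] args) {
        System.out.println("inconsistent reads: " + countTornReads(1_000_000));
    }
}
```

No lock is taken around the fields of `Point`; under the Java memory model the count should stay at zero because the object never changes after construction and the reference is published through a volatile-style write. The .NET equivalent would follow the same shape, but the exact guarantees there come from the CLR memory model, not from this sketch.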
I think what you're asking is whether, after an object is created, the constructor returns, and a reference to it is stored somewhere, there is any possibility that a thread on another processor will still see the old data. You offer as a scenario the possibility that a cache line holding instance data for the object was previously used for some other purpose.
Under an exceptionally weak memory model, such a thing might be possible, but I would expect any useful memory model, even a relatively weak one, to ensure that dereferencing an immutable object is safe, even if such safety required padding objects enough that no cache line is shared between object instances. (The GC will almost certainly invalidate all caches when it's done, but without such padding, an immutable object created by core #2 might share a cache line with an object that core #1 had previously read.) Without at least that level of safety, writing robust code would require so many locks and memory barriers that it would be hard to write multi-processor code that wasn't slower than single-processor code.
The popular x86 and x64 memory models provide the guarantee you seek, and go much further. Processors coordinate 'ownership' of cache lines; if multiple processors want to read the same cache line, they can do so without impediment. When a processor wants to write a cache line, it negotiates with other processors for ownership. Once ownership is acquired, the processor will perform the write. Other processors will not be able to read or write the cache line until the processor that owns the cache line gives it up. Note that if multiple processors want to write the same cache line simultaneously, they will likely spend most of their time negotiating cache-line ownership rather than performing actual work, but semantic correctness will be preserved.
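The last point, that threads writing to the same cache line may contend but still produce correct results, can be sketched as follows. This is a Java illustration, not .NET code, and `SharedCacheLine` and `runCounters` are hypothetical names. It assumes that adjacent `long` array slots usually land in the same 64-byte cache line, which depends on the hardware and the runtime's object layout. Each thread increments only its own slot.

```java
class SharedCacheLine {
    // Each worker thread increments only its own slot of a shared array.
    // Adjacent long slots typically share a cache line (an assumption about
    // the hardware), so the cores may have to trade ownership of that line.
    // The JIT may also keep a counter in a register, in which case there is
    // little actual contention; the result is the same either way.
    static long[] runCounters(int threads, long increments) {
        long[] slots = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t;
            workers[t] = new Thread(() -> {
                for (long i = 0; i < increments; i++) {
                    slots[slot]++;
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join makes each worker's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        long[] result = runCounters(4, 1_000_000L);
        for (int i = 0; i < result.length; i++) {
            System.out.println("slot " + i + " = " + result[i]);
        }
    }
}
```

Every slot ends up equal to `increments`: sharing a cache line can cost time, but it does not let one thread's writes corrupt a neighbouring slot. Padding the slots apart would be a performance choice, not a correctness requirement.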
You're missing that it would be a bad garbage collector indeed that let such a thing happen. The reference on core 1 should have prevented the object from being GC'd.
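The reachability argument can be checked with a small sketch. Again this uses Java as a stand-in for a managed runtime, and `ReachabilityDemo` and `survivesGcWhileReferenced` are hypothetical names. A `WeakReference` lets the code observe whether the collector has reclaimed an object without keeping it alive itself.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // Returns true if the object is still present after a GC request made
    // while a strong reference to it is held.
    static boolean survivesGcWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        System.gc(); // only a hint; the runtime may or may not collect now

        boolean alive = weak.get() != null;

        // Keeps 'strong' reachable up to this point, so the JIT cannot treat
        // the variable as dead before the check above (Java 9+).
        Reference.reachabilityFence(strong);
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + survivesGcWhileReferenced());
    }
}
```

As long as some thread can still reach the object through a strong reference, the collector must not reclaim it, so its address cannot be handed out to a new object in the way the question describes.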
I'm not sure that a memory gate would change this scenario, as that would surely only affect subsequent reads... and then the question becomes: reads from where? If it is from a field (which must at a minimum be static, or an instance field of some instance that is still on the stack or otherwise reachable) or from a local variable, then by definition the object isn't available for collection.
As for the scenario where that reference now exists only in a register... that is far trickier. Intuitively I want to say "no, that isn't a problem", but it would take a detailed look at the memory model to prove it. But handling references is such a common scenario that, simply put, this has to work.