Safe publication and the advantage of being immutable vs. effectively immutable
I'm re-reading Java Concurrency in Practice, and I'm not sure I fully understand the chapter about immutability and safe publication.
What the book says is:
"Immutable objects can be used safely by any thread without additional synchronization, even when synchronization is not used to publish them."
What I don't understand is, why would anyone (interested in making his code correct) publish some reference unsafely?
If the object is immutable, and it's published unsafely, I understand that any other thread obtaining a reference to the object would see its correct state, because of the guarantees offered by proper immutability (with final fields, etc.).
But if the publication is unsafe, another thread might still see null or the previous reference after the publication, instead of the reference to the immutable object, which seems to me like something no one would want.
And if safe publication is used to make sure the new reference is seen by all the threads, then even if the object is just effectively immutable (no final fields, but no way to mutate them), everything is safe again. As the book says:
"Safely published effectively immutable objects can be used safely by any thread without additional synchronization."
So, why is immutability (vs. effective immutability) so important? In what case would an unsafe publication be wanted?
Comments (4)
It is desirable to design objects that don't need synchronization for two reasons:
Because the above reasons are very important, it is better to learn the sometimes difficult rules and as a writer, make safe objects that don't require synchronization rather than hoping all the users of your code will remember to use it correctly.
Also remember that the author is not saying the object is unsafely published; it is safely published without synchronization.
As for your second question, I just checked, and the book does not promise you that another thread will always see the reference to the updated object, just that if it does, it will see a complete object. But I can imagine that if it is published through the constructor of another (Runnable?) object, it will be sweet. That does help with explaining all cases though.
EDIT:
effectively immutable and immutable
The difference between effectively immutable and immutable is that in the first case you still need to publish the objects in a safe way. For the truly immutable objects this isn't needed. So truly immutable objects are preferred because they are easier to publish for the reasons I stated above.
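To make the distinction concrete, here is a minimal sketch. The class and field names (`ImmutablePoint`, `EffectivelyImmutablePoint`, `PointRegistry`) are made up for illustration and do not come from the book. It assumes the constructors do not let `this` escape. The effectively immutable type is handed over only through a `volatile` field or an `AtomicReference`, two of the safe-publication channels the book lists, while the truly immutable type can tolerate a plain field.

```java
import java.util.concurrent.atomic.AtomicReference;

// Truly immutable: every field is final, so the final-field guarantee applies
// even if the reference reaches another thread through a data race
// (assuming 'this' does not escape from the constructor).
final class ImmutablePoint {
    private final int x;
    private final int y;
    ImmutablePoint(int x, int y) { this.x = x; this.y = y; }
    int x() { return x; }
    int y() { return y; }
}

// Effectively immutable: nothing mutates the fields after construction,
// but they are not final, so readers depend on safe publication.
final class EffectivelyImmutablePoint {
    private int x;
    private int y;
    EffectivelyImmutablePoint(int x, int y) { this.x = x; this.y = y; }
    int x() { return x; }
    int y() { return y; }
}

// Hypothetical holder showing which publication channel suits which type.
final class PointRegistry {
    // Plain field: tolerable for the immutable type only. A reader may see
    // null or an older reference, but never a half-built ImmutablePoint.
    static ImmutablePoint latestImmutable;

    // Safe publication channels for the effectively immutable type.
    static volatile EffectivelyImmutablePoint latestEffectivelyImmutable;
    static final AtomicReference<EffectivelyImmutablePoint> latestViaAtomic =
            new AtomicReference<>();

    static void publish(int x, int y) {
        latestImmutable = new ImmutablePoint(x, y);
        latestEffectivelyImmutable = new EffectivelyImmutablePoint(x, y);
        latestViaAtomic.set(new EffectivelyImmutablePoint(x, y));
    }
}

final class PublicationDemo {
    public static void main(String[] args) {
        PointRegistry.publish(3, 4);
        System.out.println(PointRegistry.latestImmutable.x());
        System.out.println(PointRegistry.latestEffectivelyImmutable.y());
        System.out.println(PointRegistry.latestViaAtomic.get().x());
    }
}
```

If `latestEffectivelyImmutable` were a plain field instead, a reader on another thread would be allowed to observe the reference before the writes to `x` and `y`, which is exactly the case the `final` fields rule out.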
I think the main point is that truly immutable objects are harder to break later on. If you've declared a field final, then it's final, period. You would have to remove the final in order to change that field, and that should ring an alarm. But if you've initially left the final out, someone could carelessly just add some code that changes the field, and boom - you're screwed - with only some added code (possibly in a subclass), no modification to existing code.
I would also assume that explicit immutability enables the (JIT) compiler to do some optimizations that would otherwise be hard or impossible to justify. For example, when using volatile fields, the runtime must guarantee a happens-before relation with writing and reading threads. In practice this may require memory barriers, disabling out-of-order execution optimizations, etc. - that is, a performance hit. But if the object is (deeply) immutable (contains only final references to other immutable objects), the requirement can be relaxed without breaking anything: the happens-before relation needs to be guaranteed only with writing and reading the one single reference, not the whole object graph.
So, explicit immutability makes the program simpler so that it's both easier for humans to reason about and maintain and easier for the computer to execute optimally. These benefits grow exponentially as the object graph grows, i.e. objects contain objects that contain objects - it's all simple if everything is immutable. When mutability is needed, localizing it to strictly defined places and keeping everything else immutable still gives lots of these benefits.
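The "added code in a subclass" scenario can be sketched as follows. All names here (`LooseAccount`, `SloppyAccount`, `StrictAccount`) are hypothetical and only illustrate the point: without final, a later addition can mutate the field without touching the existing class, while with final the same addition would not compile.

```java
// Effectively immutable by convention only: the field is not final.
class LooseAccount {
    protected long balanceCents;
    LooseAccount(long balanceCents) { this.balanceCents = balanceCents; }
    long balanceCents() { return balanceCents; }
}

// Code added later, with no change to LooseAccount itself. It compiles
// and silently breaks the "never changes after construction" assumption.
class SloppyAccount extends LooseAccount {
    SloppyAccount(long balanceCents) { super(balanceCents); }
    void adjust(long deltaCents) { balanceCents += deltaCents; }
}

// Explicitly immutable: the class and the field are final. Writing
// 'balanceCents += deltaCents' inside this class is a compile-time error,
// so a change has to produce a new object instead.
final class StrictAccount {
    private final long balanceCents;
    StrictAccount(long balanceCents) { this.balanceCents = balanceCents; }
    long balanceCents() { return balanceCents; }
    StrictAccount plus(long deltaCents) {
        return new StrictAccount(balanceCents + deltaCents);
    }
}

final class FinalFieldDemo {
    public static void main(String[] args) {
        SloppyAccount sloppy = new SloppyAccount(100);
        sloppy.adjust(-40);                      // mutates in place
        System.out.println(sloppy.balanceCents());

        StrictAccount strict = new StrictAccount(100);
        StrictAccount updated = strict.plus(-40); // original is untouched
        System.out.println(strict.balanceCents() + " " + updated.balanceCents());
    }
}
```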
I had the exact same question as the original poster when I finished reading chapters 1-3. I think the authors could have done a better job of elaborating on this.
I think the difference is that the internal state of effectively immutable objects can be observed to be in an inconsistent state when they are not safely published, whereas the internal state of immutable objects can never be observed to be in an inconsistent state.
However, I do think the reference to an immutable object can be observed to be out of date / stale if the reference is not safely published.
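A small sketch of what that means for the reading side, using hypothetical names (`Settings`, `SettingsHolder`). The shared field is deliberately plain, so a reader on another thread may see null or an older `Settings`. It therefore reads the field once into a local and falls back to defaults, while any `Settings` it does see has fully initialized final fields.

```java
// Immutable value: final class, final fields, no escaping 'this'.
final class Settings {
    private final String name;
    private final int timeoutMillis;
    Settings(String name, int timeoutMillis) {
        this.name = name;
        this.timeoutMillis = timeoutMillis;
    }
    String name() { return name; }
    int timeoutMillis() { return timeoutMillis; }
}

final class SettingsHolder {
    private static final Settings DEFAULTS = new Settings("defaults", 1000);

    // Intentionally neither volatile nor guarded by a lock: this is the
    // "unsafe publication" case. Other threads have no visibility guarantee.
    private static Settings current;

    static void update(Settings settings) {
        current = settings;
    }

    static Settings read() {
        Settings seen = current;  // read the racy field exactly once
        return (seen != null) ? seen : DEFAULTS;
    }
}

final class StaleReferenceDemo {
    public static void main(String[] args) {
        SettingsHolder.update(new Settings("fast", 250));
        // Same thread, so the update is visible here by program order.
        System.out.println(SettingsHolder.read().name());
    }
}
```

Whether such staleness is acceptable depends on the application; if readers must see the newest reference promptly, the field would have to be volatile or otherwise safely published.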
"Unsafe publication" is often appropriate in cases where having other threads see the latest value written to a field would be desirable, but having threads see an earlier value would be relatively harmless. A prime example is the cached hash value for String. The first time hashCode() is called on a String, it will compute a value and cache it. If another thread which calls hashCode() on the same string can see the value computed by the first thread, it won't have to recompute the hash value (thus saving time), but nothing bad will happen if the second thread doesn't see the hash value. It will simply end up performing a redundant-but-harmless computation which could have been avoided. Having hashCode() publish the hash value safely would have been possible, but the occasional redundant hash computations are much cheaper than the synchronization required for safe publication. Indeed, except on rather long strings, synchronization costs would probably negate any benefit from caching.
Unfortunately, I don't think the creators of Java imagined situations where code would write to a field and prefer that it should be visible to other threads, but not mind too much if it isn't, and where the reference stored to the field would in turn identify another object with a similar field. This leads to situations where writing semantically-correct code is much more cumbersome and likely slower than code which would be likely to work but whose semantics would not be guaranteed. I don't know any really good remedy for that in some cases other than using some gratuitous final fields to ensure that things get properly "published".
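The caching pattern described above can be sketched like this. It is an illustration with a hypothetical `Name` class, loosely modeled on the idea and not copied from the JDK's String source. The cache field is neither final nor volatile, it is read once into a local, and 0 is treated as "not computed yet".

```java
import java.util.Objects;

final class Name {
    private final String value;  // immutable payload
    private int hash;            // racy cache: not final, not volatile

    Name(String value) {
        this.value = Objects.requireNonNull(value);
    }

    @Override
    public int hashCode() {
        int h = hash;             // single read of the racy field
        if (h == 0) {             // not cached, or the cached write is not visible yet
            h = value.hashCode(); // recomputing always yields the same result
            hash = h;             // int writes are atomic, so no torn value
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Name && ((Name) other).value.equals(value);
    }
}

final class HashCacheDemo {
    public static void main(String[] args) {
        Name name = new Name("concurrency");
        // The second call can use the cached value; both calls agree.
        System.out.println(name.hashCode() == name.hashCode());
    }
}
```

Reading `hash` into the local `h` only once matters: returning the field directly after the check would be a second racy read, which is allowed to observe a different value than the first. A value whose real hash is 0 is simply recomputed on every call.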