Does re-putting an object into a ConcurrentHashMap cause a "happens-before" memory relation?

Posted on 2024-12-10 07:48:49

I'm working with existing code that has an object store in the form of a ConcurrentHashMap. The map stores mutable objects that are used by multiple threads. By design, no two threads try to modify an object at once. My concern is the visibility of the modifications between the threads.

Currently the objects' code has synchronization on the "setters" (guarded by the object itself). There is no synchronization on the "getters" nor are the members volatile. This, to me, would mean that visibility is not guaranteed. However, when an object is modified it is re-put back into the map (the put() method is called again, same key). Does this mean that when another thread pulls the object out of the map, it will see the modifications?

I've researched this here on stackoverflow, in JCIP, and in the package description for java.util.concurrent. I think I've basically confused myself... but the final straw that caused me to ask this question came from the package description, which states:

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.

In relation to my question, do "actions" include the modifications to the objects stored in the map before the re-put()? If all this does result in visibility across threads, is this an efficient approach? I'm relatively new to threads and would appreciate your comments.

Edit:

Thank you all for your responses! This was my first question on StackOverflow and it has been very helpful to me.

I have to go with ptomli's answer because I think it most clearly addressed my confusion. To wit, establishing a "happens-before" relation doesn't necessarily affect modification visibility in this case. My "title question" is poorly constructed with respect to the actual question described in the text. ptomli's answer now jibes with what I read in JCIP: "To ensure all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock" (page 37). Re-putting the object back into the map doesn't provide this common lock for the modification to the inserted object's members.

I appreciate all the tips for change (immutable objects, etc), and I wholeheartedly concur. But for this case, as I mentioned there is no concurrent modification because of careful thread handling. One thread modifies an object, and another thread later reads the object (with the CHM being the object conveyer). I think the CHM is insufficient to ensure that the later executing thread will see the modifications from the first given the situation I provided. However, I think many of you correctly answered the title question.

Comments (6)

假装爱人 2024-12-17 07:48:49

You call concurrHashMap.put after each write to an object. However, you did not specify that you also call concurrHashMap.get before each read. This is necessary.

This is true of all forms of synchronization: you need to have some "checkpoints" in both threads. Synchronizing only one thread is useless.

I haven't checked the source code of ConcurrentHashMap to make sure that put and get trigger a happens-before, but it is only logical that they should.

There is still an issue with your method however, even if you use both put and get. The problem happens when you modify an object and it is used (in an inconsistent state) by the other thread before it is put. It's a subtle problem because you might think the old value would be read since it hasn't been put yet and it would not cause a problem. The problem is that when you don't synchronize, you are not guaranteed to get a consistent older object, but rather the behavior is undefined. The JVM can update whatever part of the object in the other threads, at any time. It's only when using some explicit synchronization that you are sure you are updating the values in a consistent way across threads.

What you could do:
(1) synchronize all accesses (getters and setters) to your objects everywhere in the code. Be careful with the setters: make sure that you can't set the object in an inconsistent state. For example, when setting first and last name, having two synchronized setters is not sufficient: you must get the object lock for both operations together.
or
(2) when you put an object in the map, put a deep copy instead of the object itself. That way the other threads will never read an object in an inconsistent state.
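
The two options above can be sketched as follows. This is only an illustration under stated assumptions: `Person`, `PersonStore`, and their fields are hypothetical names, not taken from the code in the question. Option (1) guards every read and write with the object's own lock and sets both names in a single critical section; option (2) stores a copy, which is deep enough here because the fields are immutable Strings.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value class; the name and fields are illustrative only.
final class Person {
    private String firstName;
    private String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // Option (1): one lock (this) guards every read and write, and both
    // fields change inside a single critical section.
    synchronized void setName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    synchronized String fullName() {
        return firstName + " " + lastName;
    }

    // Option (2): the copy is taken under the same lock, so it is consistent.
    synchronized Person copy() {
        return new Person(firstName, lastName);
    }
}

// Hypothetical wrapper around the map that stores copies only.
final class PersonStore {
    private final ConcurrentHashMap<String, Person> map = new ConcurrentHashMap<>();

    // Later changes to the caller's instance never reach readers of the map.
    void put(String key, Person p) {
        map.put(key, p.copy());
    }

    Person get(String key) {
        return map.get(key);
    }
}
```

With this arrangement a writer can keep mutating its own instance, and readers only see the change after the next `put`.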

EDIT:
I just noticed

Currently the objects' code has synchronization on the "setters"
(guarded by the object itself). There is no synchronization on the
"getters" nor are the members volatile.

This is not good. As I said above, synchronizing on only one thread is no synchronization at all. You might synchronize on all your writer threads, but who cares, since the readers won't get the right values.

我要还你自由 2024-12-17 07:48:49

I think this has been already said across a few answers but to sum it up

If your code goes

  • CHM#get
  • call various setters
  • CHM#put

then the "happens-before" provided by the put will guarantee that all the mutate calls are executed before the put. This means that any subsequent get will be guaranteed to see those changes.

Your problem is that the actual state of the object will not be deterministic because if the actual flow of events is

  • thread 1: CHM#get
  • thread 1: call setter
  • thread 2: CHM#get
  • thread 1: call setter
  • thread 1: call setter
  • thread 1: CHM#put

then there is no guarantee over what the state of the object will be in thread 2. It might see the object with the value provided by the first setter or it might not.

The immutable copy would be the best approach, as then only completely consistent objects are published. Making the various setters synchronized (or the underlying references volatile) still doesn't let you publish a consistent state; it just means that a reader will always see the latest value from each getter on each call.
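
A minimal sketch of the immutable-copy approach, assuming a hypothetical `Account` class (none of these names come from the question). Each "setter" returns a new instance, so the get / setters / put flow never exposes an intermediate state. The plain get-then-put relies on the question's premise that only one thread updates a given key at a time.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable value: all fields are final and set once.
final class Account {
    final String owner;
    final long balance;

    Account(String owner, long balance) {
        this.owner = owner;
        this.balance = balance;
    }

    // "Setters" return a new instance instead of mutating this one.
    Account withOwner(String newOwner) {
        return new Account(newOwner, balance);
    }

    Account withBalance(long newBalance) {
        return new Account(owner, newBalance);
    }
}

final class AccountUpdates {
    // get, derive a fully built new version, then put it back.
    // Assumes a single writer per key, as described in the question.
    static Account renameAndDeposit(ConcurrentHashMap<String, Account> map,
                                    String key, String newOwner, long amount) {
        Account current = map.get(key);
        Account updated = current.withOwner(newOwner)
                                 .withBalance(current.balance + amount);
        map.put(key, updated); // only the finished object is published
        return updated;
    }
}
```

A reader that fetched the old instance before the `put` keeps a consistent old version; a reader that calls `get` afterwards sees the complete new one.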

﹎☆浅夏丿初晴 2024-12-17 07:48:49

I think your question relates more to the objects you're storing in the map, and how they react to concurrent access, than the concurrent map itself.

If the instances you're storing in the map have synchronized mutators, but not synchronized accessors, then I don't see how they can be thread safe as described.

Take the Map out of the equation and determine if the instances you're storing are thread safe by themselves.

However, when an object is modified it is re-put back into the map (the put() method is called again, same key). Does this mean that when another thread pulls the object out of the map, it will see the modifications?

This exemplifies the confusion. The instance that is re-put into the Map will be retrieved from the Map by another thread. This is the guarantee of the concurrent map. That has nothing to do with visibility of the state of the stored instance itself.

时常饿 2024-12-17 07:48:49

My understanding is that it should work for all gets after the re-put, but this would be a very unsafe method of synchronization.

What happens to gets that happen before the re-put, but while modifications are happening? They may see only some of the changes, and the object would have an inconsistent state.

If you can, I'd recommend storing immutable objects in the map. Then any get will retrieve a version of the object that was current when it did the get.
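
As a rough illustration of the "every get sees a complete version" point, the sketch below goes one step beyond the single-writer design in the question and lets several threads replace an immutable value with `ConcurrentHashMap.compute`, which performs the replacement atomically per key. `Tally` and `TallyDemo` are hypothetical names used only for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable value whose two fields must always agree.
final class Tally {
    final int count;
    final int doubled; // invariant: doubled == 2 * count

    Tally(int count) {
        this.count = count;
        this.doubled = 2 * count;
    }

    Tally increment() {
        return new Tally(count + 1);
    }
}

final class TallyDemo {
    static Tally run(int threads, int perThread) {
        ConcurrentHashMap<String, Tally> map = new ConcurrentHashMap<>();
        map.put("t", new Tally(0));

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // compute replaces the mapping atomically for this key.
                    map.compute("t", (k, old) -> old.increment());
                    // Whatever version get returns is internally consistent.
                    Tally seen = map.get("t");
                    if (seen.doubled != 2 * seen.count) {
                        throw new IllegalStateException("inconsistent version");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("t");
    }
}
```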

铁憨憨 2024-12-17 07:48:49

This is a code snippet from java.util.concurrent.ConcurrentHashMap (OpenJDK 7):

    public V get(Object key) {
        Segment<K,V> s; // manually integrate access methods to reduce overhead
        HashEntry<K,V>[] tab;
        int h = hash(key.hashCode());
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return e.value;
            }
        }
        return null;
    }

UNSAFE.getObjectVolatile() is documented as a getter with volatile semantics, so a memory barrier is crossed when the reference is read.

缱倦旧时光 2024-12-17 07:48:49

Yes, put incurs a volatile write, even if the key-value pair already exists in the map.

Using ConcurrentHashMap to publish objects across threads is pretty efficient. Objects should not be modified further once they are in the map. (They don't have to be strictly immutable, i.e. with final fields.)
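
A small sketch of this publish-then-never-modify pattern, assuming a hypothetical `Message` class with plain (non-final, non-volatile) fields. The writer finishes all writes before `put` and does not touch the object afterwards; per the java.util.concurrent memory consistency guarantee quoted in the question, a reader whose `get` returns the object sees those writes.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical message type with plain (non-final, non-volatile) fields.
final class Message {
    int id;
    String text;
}

final class PublishDemo {
    static String publishAndRead() {
        ConcurrentHashMap<String, Message> map = new ConcurrentHashMap<>();
        String[] seen = new String[1];

        Thread reader = new Thread(() -> {
            Message received;
            // Poll until the writer's put becomes visible through get.
            while ((received = map.get("msg")) == null) {
                Thread.yield();
            }
            seen[0] = received.id + ":" + received.text;
        });
        reader.start();

        Message msg = new Message();
        msg.id = 7;
        msg.text = "hello";
        map.put("msg", msg); // the writes above happen-before a get that returns msg
        // msg must not be modified after this point.

        try {
            reader.join(); // join makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Calling `PublishDemo.publishAndRead()` returns `"7:hello"`. The guarantee covers only writes made before the `put`; any later mutation of `msg` would need its own synchronization.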
