Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread?

Posted 2024-08-12 16:13:20

Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread? My expectation is that it is, and reading the JavaDocs seems to indicate so, but I am 99% convinced that reality is different. On my production server the below seems to be happening. (I've caught it with logging.)

Pseudo code example:

static final ConcurrentHashMap map = new ConcurrentHashMap();
//sharedLock is key specific.  One map, many keys.  There is a 1:1 
//      relationship between key and Foo instance.
void doSomething(Semaphore sharedLock) {
    boolean haveLock = sharedLock.tryAcquire(3000, MILLISECONDS);

    if (haveLock) {
        log("Have lock: " + threadId);
        Foo foo = map.get("key");
        log("foo=" + foo);

        if (foo == null) {
            log("New foo time! " + threadId);
            foo = new Foo(); //foo is expensive to instance
            map.put("key", foo);

        } else
            log("Found foo:" + threadId);

        log("foo=" + foo);
        sharedLock.release();

    } else
        log("No lock acquired");
}

What seems to be happening is this:

Thread 1                          Thread 2
 - request lock                    - request lock
 - have lock                       - blocked waiting for lock
 - get from map, nothing there
 - create new foo
 - place new foo in map
 - logs foo.toString()
 - release lock
 - exit method                     - have lock
                                   - get from map, NOTHING THERE!!! (Why not?)
                                   - create new foo
                                   - place new foo in map
                                   - logs foo.toString()
                                   - release lock
                                   - exit method

So, my output looks like this:

Have lock: 1    
foo=null
New foo time! 1
foo=foo@cafebabe420
Have lock: 2    
foo=null
New foo time! 2
foo=foo@boof00boo

The second thread does not immediately see the put! Why? On my production system, there are more threads and I've only seen one thread, the first one that immediately follows thread 1, have a problem.

I've even tried shrinking the concurrency level on ConcurrentHashMap to 1, not that it should matter. E.g.:

static ConcurrentHashMap map = new ConcurrentHashMap(32, 1);

Where am I going wrong? My expectation? Or is there some bug in my code (the real software, not the above) that is causing this? I've gone over it repeatedly and am 99% sure I'm handling the locking correctly. I cannot even fathom a bug in ConcurrentHashMap or the JVM. Please save me from myself.

Gory specifics that might be relevant:

quad-core 64-bit Xeon (DL380 G5)
RHEL4 (Linux mysvr 2.6.9-78.0.5.ELsmp #1 SMP ... x86_64 GNU/Linux)
Java 6 (build 1.6.0_07-b06, 64-Bit Server VM (build 10.0-b23, mixed mode))

擦肩而过的背影 2024-08-19 16:13:20

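There are some good answers here, but as far as I can tell nobody has actually given a canonical answer to the question as asked: "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread?" Those who said yes did not provide a source.

So: yes, it is guaranteed. Source (see "Memory Consistency Properties" in the java.util.concurrent package documentation):

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.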
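To make the quoted guarantee concrete, here is a minimal sketch. The `Box` class and the `PublishViaMap` name are made up for illustration and are not from the question. The writer sets a plain, non-volatile field and then calls put(); a reader that obtains the object through get() is ordered after that write and therefore sees it.

```java
import java.util.concurrent.ConcurrentHashMap;

public class PublishViaMap {
    // Hypothetical holder with a plain (non-volatile, non-final) field.
    static class Box {
        int value;
    }

    // Returns the field value that the reading thread observes through the map.
    static int readThroughMap() {
        final ConcurrentHashMap<String, Box> map = new ConcurrentHashMap<String, Box>();
        Thread writer = new Thread(new Runnable() {
            @Override
            public void run() {
                Box box = new Box();
                box.value = 42;      // plain write, ordered before the put
                map.put("key", box); // publication point
            }
        });
        writer.start();

        Box seen;
        while ((seen = map.get("key")) == null) {
            Thread.yield(); // spin until the entry becomes visible
        }
        return seen.value; // ordered after the get() that returned the box
    }

    public static void main(String[] args) {
        System.out.println("value=" + readThroughMap()); // prints value=42
    }
}
```

The reader never synchronizes with the writer except through the map, which is exactly the case the package documentation covers.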

悲喜皆因你 2024-08-19 16:13:20

Creating an expensive-to-create object in a cache after failing to find it there is a known problem. Fortunately, a solution has already been implemented.

You can use MapMaker from Google Collections. You just give it a callback that creates your object; if the client code looks up a key that is not in the map, the callback is called and the result is put in the map.

See MapMaker javadocs ...

 ConcurrentMap<Key, Graph> graphs = new MapMaker()
       .concurrencyLevel(32)
       .softKeys()
       .weakValues()
       .expiration(30, TimeUnit.MINUTES)
       .makeComputingMap(
           new Function<Key, Graph>() {
             public Graph apply(Key key) {
               return createExpensiveGraph(key);
             }
           });

BTW, in your original example there is no advantage to using a ConcurrentHashMap, as you are locking each access. Why not just use a normal HashMap inside your locked section?
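If an extra dependency is not an option, a similar compute-on-miss behaviour is available in the JDK itself through ConcurrentHashMap.computeIfAbsent(). This is only a sketch and it assumes Java 8 or later, which the question (Java 6) does not have. `createExpensive` and the `ComputeOnce` class are hypothetical stand-ins for the expensive factory.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class ComputeOnce {
    static final AtomicInteger creations = new AtomicInteger();

    // Hypothetical stand-in for the expensive constructor; counts its invocations.
    static Object createExpensive(String key) {
        creations.incrementAndGet();
        return new Object();
    }

    // Starts `threads` concurrent lookups of one key and returns how often the factory ran.
    static int run(int threads) {
        creations.set(0);
        final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
        final CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                // Atomic per key: the mapping function is applied at most once.
                cache.computeIfAbsent("key", ComputeOnce::createExpensive);
                done.countDown();
            }).start();
        }
        try {
            done.await(); // wait for all lookups to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return creations.get();
    }

    public static void main(String[] args) {
        System.out.println("factory calls: " + run(16)); // prints factory calls: 1
    }
}
```

Unlike the semaphore approach, no explicit per-key lock is needed; threads asking for the same missing key wait for the single computation.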

天暗了我发光 2024-08-19 16:13:20

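If one thread puts a value into a concurrent hash map, any other thread that subsequently retrieves that value from the map is guaranteed to see what the first thread inserted.

This is spelled out in "Java Concurrency in Practice" (Brian Goetz et al.; Joshua Bloch is one of the co-authors).

Quoting the text:

The thread-safe library collections offer the following safe publication guarantees, even if the Javadoc is less than clear on the subject:
Placing a key or value in a Hashtable, synchronizedMap, or ConcurrentMap safely publishes it to any thread that retrieves it from the Map (whether directly or via an iterator);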

想你只要分分秒秒 2024-08-19 16:13:20

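One thing to consider is whether your keys are equal and have identical hash codes at both "get" calls. If they are just Strings then yes, there is not going to be a problem here. But since you haven't given the generic type of the keys, and you have elided "unimportant" details in the pseudocode, I wonder whether you are using another class as the key.

In any case, you may want to additionally log the hash code of the keys used for the gets/puts in threads 1 and 2. If these differ, you have found your problem. Also note that key1.equals(key2) must be true. This isn't something you can log definitively, but if the keys aren't final classes it would be worth logging their fully qualified class names and then looking at the equals() method of that class (or classes) to see whether the second key could be considered unequal to the first.

And to answer your title: yes, ConcurrentHashMap.get() is guaranteed to see any previous put(), where "previous" means there is a happens-before relationship between the two, as specified by the Java Memory Model. (For ConcurrentHashMap in particular, this is essentially what you would expect, with the caveat that you may not be able to tell which happens first if both threads execute at "exactly the same time" on different cores. In your case, though, you should definitely see the result of the put() in thread 2.)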
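The key-equality concern above can be reproduced in a few lines. This is only an illustration: `BadKey` and `GoodKey` are hypothetical classes, not anything from the question. A key class that inherits the identity-based equals() and hashCode() from Object makes get() miss an entry that a logically identical key just stored, with no threading involved at all.

```java
import java.util.concurrent.ConcurrentHashMap;

public class KeyEquality {
    // Hypothetical key that forgets to override equals() and hashCode().
    static final class BadKey {
        final String id;
        BadKey(String id) { this.id = id; }
    }

    // Same data, but with value-based equals() and hashCode().
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Looks up with a second, logically identical BadKey instance.
    static boolean foundWithBadKey() {
        ConcurrentHashMap<BadKey, String> map = new ConcurrentHashMap<BadKey, String>();
        map.put(new BadKey("k"), "foo");
        return map.get(new BadKey("k")) != null; // identity comparison: miss
    }

    // Looks up with a second, logically identical GoodKey instance.
    static boolean foundWithGoodKey() {
        ConcurrentHashMap<GoodKey, String> map = new ConcurrentHashMap<GoodKey, String>();
        map.put(new GoodKey("k"), "foo");
        return map.get(new GoodKey("k")) != null; // value comparison: hit
    }

    public static void main(String[] args) {
        System.out.println("bad key found:  " + foundWithBadKey());  // prints false
        System.out.println("good key found: " + foundWithGoodKey()); // prints true
    }
}
```

If the real code builds a fresh key object per request, logging `key.hashCode()` and the key's class name next to each get/put would expose this immediately.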

天暗了我发光 2024-08-19 16:13:20

I don't think the problem is in ConcurrentHashMap, but rather somewhere in your code or in your reasoning about it. I can't spot the error in the code above (maybe we just don't see the bad part?).

But to answer your question "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread?", I've hacked together a small test program.

In short: yes, and ConcurrentHashMap is OK!

If the map were broken, the following program should print "Bad access!" at least from time to time. It runs 100,000 calls to the method you outlined above on a pool of 100 threads. But it only prints "All ok!".

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class Test {
    private final static ConcurrentHashMap<String, Test> map = new ConcurrentHashMap<String, Test>();
    private final static Semaphore lock = new Semaphore(1);
    private static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        List<Callable<Boolean>> testCalls = new ArrayList<Callable<Boolean>>();
        for (int n = 0; n < 100000; n++)
            testCalls.add(new Callable<Boolean>() {
                @Override
                public Boolean call() throws Exception {
                    doSomething(lock);
                    return true;
                }
            });
        pool.invokeAll(testCalls);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("All ok!");
    }

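    static void doSomething(Semaphore lock) throws InterruptedException {
        boolean haveLock = lock.tryAcquire(3000, TimeUnit.MILLISECONDS);

        if (haveLock) {
            try {
                Test foo = map.get("key");
                if (foo == null) {
                    foo = new Test();
                    map.put("key", foo); // store the instance that was just created
                    // counter is only touched while holding the permit; a second
                    // miss would mean an earlier put() was not visible
                    if (counter > 0)
                        System.err.println("Bad access!");
                    counter++;
                }
            } finally {
                lock.release(); // always hand the permit back
            }
        } else {
            System.err.println("Fail to lock!");
        }
    }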
}

人生百味 2024-08-19 16:13:20

Update: putIfAbsent() is logically correct here, but it doesn't solve the problem of creating a Foo only when the key is absent. The caller always creates the Foo, even if it doesn't end up in the map. David Roussel's answer is good, assuming you can accept the Google Collections dependency in your app.

Maybe I'm missing something obvious, but why are you guarding the map with a Semaphore? ConcurrentHashMap (CHM) is thread-safe (assuming it's safely published, which it is here). If you're trying to get an atomic "put if not already in there", use chm.putIfAbsent(). If you need more complicated invariants where the map contents cannot change, you probably need to use a regular HashMap and synchronize it as usual.
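A common way to use putIfAbsent() without constructing a Foo on every call is to check with get() first. The sketch below is illustrative only: `Foo`, `getOrCreate`, and the `constructed` counter are hypothetical names. It still allows several racing threads to each build a Foo on a simultaneous miss, but only one instance ever becomes the mapped value.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PutIfAbsentDemo {
    static final AtomicInteger constructed = new AtomicInteger();

    // Hypothetical stand-in for the expensive Foo from the question.
    static class Foo {
        Foo() { constructed.incrementAndGet(); }
    }

    static final ConcurrentHashMap<String, Foo> map = new ConcurrentHashMap<String, Foo>();

    // Returns the single Foo that ends up mapped to the key.
    static Foo getOrCreate(String key) {
        Foo foo = map.get(key);
        if (foo == null) {
            Foo candidate = new Foo();              // may run in several racing threads
            foo = map.putIfAbsent(key, candidate);  // null means candidate was stored
            if (foo == null) {
                foo = candidate;
            }
        }
        return foo;
    }

    public static void main(String[] args) {
        Foo a = getOrCreate("key");
        Foo b = getOrCreate("key");
        System.out.println("same instance: " + (a == b));          // prints true
        System.out.println("constructed:   " + constructed.get()); // prints 1
    }
}
```

When constructing a surplus Foo under contention is unacceptable, a computing map as in the MapMaker answer is the better fit.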

To answer your question more directly: Once your put returns, the value you put in the map is guaranteed to be seen by the next thread that looks for it.

Side note: +1 to the other comments about putting the semaphore release in a finally block.

if (sem.tryAcquire(3000, TimeUnit.MILLISECONDS)) {
    try {
        // do stuff while holding permit    
    } finally {
        sem.release();
    }
}

节枝 2024-08-19 16:13:20

Are we seeing an interesting manifestation of the Java Memory Model? Under what conditions are registers flushed to main memory? I think it's guaranteed that if two threads synchronize on the same object, they will see a consistent view of memory.

I don't know what Semaphore does internally. It almost certainly has to do some synchronization, but do we know that?
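For what it's worth, the Semaphore Javadoc does document a memory consistency effect: actions in a thread prior to calling release() happen-before actions following a successful acquire in another thread. A minimal sketch of what that implies follows; the `SemaphoreVisibility` class and its `shared` field are made up for illustration.

```java
import java.util.concurrent.Semaphore;

public class SemaphoreVisibility {
    static int shared; // plain field, deliberately not volatile

    // Hands a value from a writer thread to the caller using only a Semaphore.
    static int handOff() {
        final Semaphore permit = new Semaphore(0); // no permits until the writer releases
        Thread writer = new Thread(new Runnable() {
            @Override
            public void run() {
                shared = 7;       // plain write
                permit.release(); // happens-before the successful acquire below
            }
        });
        writer.start();
        permit.acquireUninterruptibly(); // blocks until the writer has released
        return shared;                   // guaranteed to observe the write
    }

    public static void main(String[] args) {
        System.out.println("shared=" + handOff()); // prints shared=7
    }
}
```

So a correctly shared Semaphore should give the same visibility between the releasing and the acquiring thread as a synchronized block on a shared lock object would.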

What happens if you do