Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread?

Posted 2024-08-12 16:13:20

Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread? My expectation is that it is, and reading the JavaDocs seems to indicate so, but I am 99% convinced that reality is different. On my production server the below seems to be happening. (I've caught it with logging.)

Pseudo code example:

static final ConcurrentHashMap map = new ConcurrentHashMap();
//sharedLock is key specific.  One map, many keys.  There is a 1:1 
//      relationship between key and Foo instance.
void doSomething(Semaphore sharedLock) {
    boolean haveLock = sharedLock.tryAcquire(3000, MILLISECONDS);

    if (haveLock) {
        log("Have lock: " + threadId);
        Foo foo = map.get("key");
        log("foo=" + foo);

        if (foo == null) {
            log("New foo time! " + threadId);
            foo = new Foo(); //foo is expensive to instance
            map.put("key", foo);

        } else
            log("Found foo:" + threadId);

        log("foo=" + foo);
        sharedLock.release();

    } else
        log("No lock acquired");
} 

What seems to be happening is this:

Thread 1                          Thread 2
 - request lock                    - request lock
 - have lock                       - blocked waiting for lock
 - get from map, nothing there
 - create new foo
 - place new foo in map
 - logs foo.toString()
 - release lock
 - exit method                     - have lock
                                   - get from map, NOTHING THERE!!! (Why not?)
                                   - create new foo
                                   - place new foo in map
                                   - logs foo.toString()
                                   - release lock
                                   - exit method

So, my output looks like this:

Have lock: 1    
foo=null
New foo time! 1
foo=foo@cafebabe420
Have lock: 2    
foo=null
New foo time! 2
foo=foo@boof00boo    

The second thread does not immediately see the put! Why? On my production system, there are more threads and I've only seen one thread, the first one that immediately follows thread 1, have a problem.

I've even tried shrinking the concurrency level on ConcurrentHashMap to 1, not that it should matter. E.g.:

static ConcurrentHashMap map = new ConcurrentHashMap(32, 0.75f, 1); // initialCapacity, loadFactor, concurrencyLevel

Where am I going wrong? My expectation? Or is there some bug in my code (the real software, not the above) that is causing this? I've gone over it repeatedly and am 99% sure I'm handling the locking correctly. I cannot even fathom a bug in ConcurrentHashMap or the JVM. Please save me from myself.

Gory specifics that might be relevant:

  • quad-core 64-bit Xeon (DL380 G5)
  • RHEL4 (Linux mysvr 2.6.9-78.0.5.ELsmp #1 SMP ... x86_64 GNU/Linux)
  • Java 6 (build 1.6.0_07-b06, 64-Bit Server VM (build 10.0-b23, mixed mode))

Comments (8)

擦肩而过的背影 2024-08-19 16:13:20

Some good answers here, but as far as I can tell no-one has actually provided a canonical answer to the question asked: "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread". Those that have said yes haven't provided a source.

So: yes, it is guaranteed. Source: the java.util.concurrent package Javadoc (see the section 'Memory Consistency Properties'):

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
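To make the quoted guarantee concrete, here is a minimal sketch (not taken from the question's code). The class name SafePublicationDemo, the Foo stand-in, and the payload field are all made up for illustration, and Java 8 lambdas are used for brevity. A writer sets a plain, non-volatile field and then calls put(); a reader that obtains the object through get() must observe that field write.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SafePublicationDemo {
    // Hypothetical stand-in for the question's Foo; the field is
    // deliberately neither final nor volatile.
    static class Foo {
        int payload;
    }

    static int readPublishedPayload() {
        final ConcurrentMap<String, Foo> map = new ConcurrentHashMap<>();
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            Foo foo;
            // Spin until the writer's put() is visible to this thread.
            while ((foo = map.get("key")) == null) {
                Thread.yield();
            }
            // The field write happened-before put(), which happened-before
            // the get() that returned foo, so the write is visible here.
            seen[0] = foo.payload;
        });
        reader.start();

        Foo foo = new Foo();
        foo.payload = 42;    // plain write before publication
        map.put("key", foo); // publication point

        try {
            reader.join();   // join() makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("payload seen by reader: " + readPublishedPayload());
    }
}
```

The reader can only leave its loop after get() returns the object, so the happens-before edge from put() covers the earlier field write as well.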

悲喜皆因你 2024-08-19 16:13:20

Creating an expensive-to-create object in a cache after failing to find it there is a known problem, and fortunately a solution has already been implemented.

You can use MapMaker from Google Collections. You just give it a callback that creates your object; if the client code looks in the map and the key is missing, the callback is called and the result is put in the map.

See MapMaker javadocs ...

 ConcurrentMap<Key, Graph> graphs = new MapMaker()
       .concurrencyLevel(32)
       .softKeys()
       .weakValues()
       .expiration(30, TimeUnit.MINUTES)
       .makeComputingMap(
           new Function<Key, Graph>() {
             public Graph apply(Key key) {
               return createExpensiveGraph(key);
             }
           });

BTW, in your original example there is no advantage to using a ConcurrentHashMap, since you are locking each access. Why not just use a normal HashMap inside your locked section?
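If adding a library is not an option and you are on Java 8 or later (the question targets Java 6, so this is an assumption), ConcurrentHashMap.computeIfAbsent() offers a similar "create on first lookup" callback. The sketch below is illustrative only: ComputeIfAbsentDemo, Foo, lookup and runThreads are hypothetical names, and the counter exists just to show how many times the factory ran.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeIfAbsentDemo {
    // Counts constructor calls so the demo can report them.
    static final AtomicInteger created = new AtomicInteger();

    // Hypothetical stand-in for the expensive object from the question.
    static class Foo {
        Foo() {
            created.incrementAndGet();
        }
    }

    static final ConcurrentHashMap<String, Foo> cache = new ConcurrentHashMap<>();

    static Foo lookup(String key) {
        // ConcurrentHashMap applies the mapping function at most once per
        // absent key; concurrent callers for the same key wait for it.
        return cache.computeIfAbsent(key, k -> new Foo());
    }

    static int runThreads(int n) {
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> lookup("key"));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("Foo instances created: " + runThreads(8));
    }
}
```

The mapping function should be short and must not touch other mappings of the same map, since other updates to that map may block while it runs.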

天暗了我发光 2024-08-19 16:13:20

If a thread puts a value in a concurrent hash map, then any other thread that retrieves that value from the map is guaranteed to see what the first thread inserted.

This issue has been clarified in "Java Concurrency in Practice" by Brian Goetz et al. (Joshua Bloch is one of the co-authors).

Quoting from the text :-

The thread-safe library collections offer the following safe publication guarantees, even if the javadoc is less than clear on the subject:

  • Placing a key or value in a Hashtable, synchronizedMap or ConcurrentMap safely publishes it to any other thread that retrieves it from the Map (whether directly or via an iterator);

想你只要分分秒秒 2024-08-19 16:13:20

One thing to consider is whether your keys are equal and have identical hashcodes on both "get" calls. If they're just Strings then yes, there's not going to be a problem here. But as you haven't given the generic type of the keys, and you have elided "unimportant" details in the pseudocode, I wonder if you're using another class as a key.

In any case, you may want to additionally log the hashcode of the keys used for the gets/puts in threads 1 and 2. If these are different, you have your problem. Also note that key1.equals(key2) must be true; this isn't something you can log definitively, but if the keys aren't final classes it would be worth logging their fully qualified class name, then looking at the equals() method for that class/classes to see if it's possible that the second key could be considered unequal to the first.
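As an illustration of that failure mode, here is a small sketch with two made-up key classes (IdentityKey and ValueKey are hypothetical, not from the question). A key class that does not override equals() and hashCode() makes get() miss an entry that a logically identical key put there, regardless of threading.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class KeyEqualityDemo {
    // Hypothetical key that inherits identity-based equals()/hashCode()
    // from Object: two instances with the same name are different keys.
    static final class IdentityKey {
        final String name;
        IdentityKey(String name) { this.name = name; }
    }

    // Hypothetical key with value-based equals()/hashCode().
    static final class ValueKey {
        final String name;
        ValueKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueKey && ((ValueKey) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    static boolean identityKeyFound() {
        Map<IdentityKey, String> map = new ConcurrentHashMap<>();
        map.put(new IdentityKey("key"), "foo");
        // A fresh instance is not equal to the one used for put().
        return map.get(new IdentityKey("key")) != null;
    }

    static boolean valueKeyFound() {
        Map<ValueKey, String> map = new ConcurrentHashMap<>();
        map.put(new ValueKey("key"), "foo");
        return map.get(new ValueKey("key")) != null;
    }

    public static void main(String[] args) {
        System.out.println("identity key found: " + identityKeyFound()); // false
        System.out.println("value key found: " + valueKeyFound());       // true
    }
}
```

Logging the key's hashCode() and class name on both the put and the get, as suggested above, would expose this kind of mismatch.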

And to answer your title: yes, ConcurrentHashMap.get() is guaranteed to see any previous put(), where "previous" means there is a happens-before relationship between the two as specified by the Java Memory Model. (For ConcurrentHashMap in particular, this is essentially what you'd expect, with the caveat that you may not be able to tell which happens first if both threads execute at "exactly the same time" on different cores.) In your case, though, thread 2 should definitely see the result of the put() made by thread 1.

天暗了我发光 2024-08-19 16:13:20

I don't think the problem is in "ConcurrentHashMap" but rather somewhere in your code or in your reasoning about your code. I can't spot the error in the code above (maybe we just don't see the bad part?).

But to answer your question "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread?" I've hacked together a small test program.

In short: no, the fault is not in ConcurrentHashMap; it is OK!

If the map were broken, the following program should print "Bad access!" at least from time to time. It runs 100000 calls to the method you outlined above on a pool of 100 threads. But it only prints "All ok!".

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class Test {
    private final static ConcurrentHashMap<String, Test> map = new ConcurrentHashMap<String, Test>();
    private final static Semaphore lock = new Semaphore(1);
    private static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        List<Callable<Boolean>> testCalls = new ArrayList<Callable<Boolean>>();
        for (int n = 0; n < 100000; n++)
            testCalls.add(new Callable<Boolean>() {
                @Override
                public Boolean call() throws Exception {
                    doSomething(lock);
                    return true;
                }
            });
        pool.invokeAll(testCalls);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("All ok!");
    }

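    static void doSomething(Semaphore lock) throws InterruptedException {
        boolean haveLock = lock.tryAcquire(3000, TimeUnit.MILLISECONDS);

        if (haveLock) {
            try {
                Test foo = map.get("key");
                if (foo == null) {
                    foo = new Test();
                    map.put("key", foo); // store the instance just created
                    // counter is only touched while holding the permit;
                    // a second creation means an earlier put() was missed
                    if (counter > 0)
                        System.err.println("Bad access!");
                    counter++;
                }
            } finally {
                lock.release(); // always give the permit back
            }
        } else {
            System.err.println("Fail to lock!");
        }
    }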
}
人生百味 2024-08-19 16:13:20

Update: putIfAbsent() is logically correct here, but it doesn't solve the problem of creating a Foo only when the key is not present. The caller always creates the Foo, even if it doesn't end up in the map. David Roussel's answer is good, assuming you can accept the Google Collections dependency in your app.


Maybe I'm missing something obvious, but why are you guarding the map with a Semaphore? ConcurrentHashMap (CHM) is thread-safe (assuming it's safely published, which it is here). If you're trying to get an atomic "put if not already in there", use chm.putIfAbsent(). If you need more complicated invariants where the map contents cannot change, you probably need to use a regular HashMap and synchronize it as usual.
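For reference, a minimal sketch of the putIfAbsent() idiom. PutIfAbsentDemo, Foo and getOrCreate are hypothetical names chosen for illustration. Checking get() first narrows the window in which a Foo is built needlessly, but it does not close it: two threads that both see null will each construct a candidate, and only one candidate ends up in the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PutIfAbsentDemo {
    // Hypothetical stand-in for the expensive object from the question.
    static class Foo {
    }

    static final ConcurrentMap<String, Foo> map = new ConcurrentHashMap<>();

    static Foo getOrCreate(String key) {
        Foo existing = map.get(key);
        if (existing != null) {
            return existing;
        }
        // May be wasted work if another thread wins the race below.
        Foo candidate = new Foo();
        // putIfAbsent() returns the value already present, or null if
        // our candidate was the one inserted.
        Foo previous = map.putIfAbsent(key, candidate);
        return previous != null ? previous : candidate;
    }

    public static void main(String[] args) {
        Foo first = getOrCreate("key");
        Foo second = getOrCreate("key");
        System.out.println("same instance: " + (first == second));
    }
}
```

Every caller gets the same instance back, so the idiom is safe whenever an occasional discarded Foo is acceptable.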

To answer your question more directly: Once your put returns, the value you put in the map is guaranteed to be seen by the next thread that looks for it.

Side note: +1 to the other comments about putting the semaphore release in a finally block.

if (sem.tryAcquire(3000, TimeUnit.MILLISECONDS)) {
    try {
        // do stuff while holding permit    
    } finally {
        sem.release();
    }
}
节枝 2024-08-19 16:13:20

Are we seeing an interesting manifestation of the Java Memory Model? Under what conditions are registers flushed to main memory? I think it's guaranteed that if two threads synchronize on the same object then they will see a consistent memory view.

I don't know what Semaphore does internally. It almost certainly has to do some synchronization, but do we know that?

What happens if you do

synchronized (dedicatedLockObject) { ... }

instead of acquiring the semaphore?
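A sketch of that experiment, under the assumption that a single dedicated monitor guards every access (SynchronizedCacheDemo, Foo, getOrCreate and runThreads are hypothetical names). With every access inside the synchronized block a plain HashMap is enough. For what it's worth, the Semaphore Javadoc also documents memory consistency effects: actions before a release() happen-before actions following a successful acquire in another thread.

```java
import java.util.HashMap;
import java.util.Map;

class SynchronizedCacheDemo {
    // Hypothetical stand-in for the expensive object from the question.
    static class Foo {
    }

    private static final Object dedicatedLockObject = new Object();
    // A plain HashMap is safe here because it is only touched under the lock.
    private static final Map<String, Foo> map = new HashMap<>();
    private static int created = 0; // also guarded by dedicatedLockObject

    static Foo getOrCreate(String key) {
        synchronized (dedicatedLockObject) {
            Foo foo = map.get(key);
            if (foo == null) {
                foo = new Foo();
                map.put(key, foo);
                created++;
            }
            return foo;
        }
    }

    static int runThreads(int n) {
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> getOrCreate("key"));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (dedicatedLockObject) {
            return created;
        }
    }

    public static void main(String[] args) {
        System.out.println("Foo instances created: " + runThreads(8));
    }
}
```

If the real code still creates two instances with this structure, the cause is more likely the keys or the lock selection than memory visibility.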

余生共白头 2024-08-19 16:13:20

Why are you locking a concurrent hash map? By definition it's thread-safe.
If there's a problem, it's in your locking code.
That's why we have thread-safe packages in Java.
The best way to debug this is with barrier synchronization.
