Should you check whether the map contains a key before using ConcurrentMap's putIfAbsent?

Posted 2024-09-24 19:12:43

I have been using Java's ConcurrentMap for a map that can be used from multiple threads. putIfAbsent is a great method and is much easier to read and write than the equivalent standard map operations. I have some code that looks like this:

ConcurrentMap<String, Set<X>> map = new ConcurrentHashMap<String, Set<X>>();

// ...

map.putIfAbsent(name, new HashSet<X>());
map.get(name).add(Y);

Readability-wise this is great, but it creates a new HashSet every time, even if one is already in the map. I could write this:

if (!map.containsKey(name)) {
    map.putIfAbsent(name, new HashSet<X>());
}
map.get(name).add(Y);

With this change it loses a bit of readability but does not need to create the HashSet every time. Which is better in this case? I tend to side with the first one since it is more readable. The second would perform better and may be more correct. Maybe there is a better way to do this than either of these.

What is the best practice for using putIfAbsent in this manner?

习惯那些不曾习惯的习惯 2024-10-01 19:12:43

Concurrency is hard. If you are going to bother with concurrent maps instead of straightforward locking, you might as well go for it. Indeed, don't do lookups more than necessary.

Set<X> set = map.get(name);
if (set == null) {
    final Set<X> value = new HashSet<X>();
    set = map.putIfAbsent(name, value);
    if (set == null) {
        set = value;
    }
}

(Usual stackoverflow disclaimer: Off the top of my head. Not tested. Not compiled. Etc.)
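As a minimal sketch of how the get-then-putIfAbsent idiom above could be wrapped up and exercised from several threads: the helper name `getOrCreate`, the class name, and the thread counts are all made up for illustration. It also swaps the plain HashSet for `ConcurrentHashMap.newKeySet()` (Java 8+), on the assumption that the value sets themselves are mutated concurrently.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class GetOrCreateSketch {
    // Hypothetical helper: one lookup on the common path, and a new set is
    // allocated only when that first lookup misses.
    static <K, V> Set<V> getOrCreate(ConcurrentMap<K, Set<V>> map, K key) {
        Set<V> set = map.get(key);
        if (set == null) {
            Set<V> fresh = ConcurrentHashMap.newKeySet(); // thread-safe set (assumption)
            set = map.putIfAbsent(key, fresh);            // null means our set was stored
            if (set == null) {
                set = fresh;
            }
        }
        return set;
    }

    // Four threads add 1000 distinct integers each under the same key.
    static int demo() {
        ConcurrentMap<String, Set<Integer>> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int base = t * 1000;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    getOrCreate(map, "name").add(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return map.get("name").size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 4000: every element landed in the one shared set
    }
}
```

A losing thread may still allocate a set that is immediately discarded, but only while the key is being created for the first time.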

Update: Java 1.8 added a computeIfAbsent default method to ConcurrentMap (and to Map, which is kind of interesting because that implementation would be wrong for ConcurrentMap). (And 1.7 added the "diamond operator" <>.)

Set<X> set = map.computeIfAbsent(name, n -> new HashSet<>());

(Note, you are responsible for the thread-safety of any operations of the HashSets contained in the ConcurrentMap.)
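To make the thread-safety note concrete, here is a small sketch that pairs computeIfAbsent with a concurrent set rather than a HashSet. Using `ConcurrentHashMap.newKeySet()` is one possible choice, not the only one; the class name and the parallel-stream driver are illustrative assumptions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.IntStream;

class ComputeIfAbsentSketch {
    static int demo() {
        ConcurrentMap<String, Set<Integer>> map = new ConcurrentHashMap<>();
        // Many threads add to the set stored under the same key. The map protects
        // the key-to-set mapping; the concurrent set protects the add() calls.
        IntStream.range(0, 10_000).parallel().forEach(i ->
                map.computeIfAbsent("name", k -> ConcurrentHashMap.newKeySet()).add(i));
        return map.get("name").size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 10000
    }
}
```

With a plain HashSet in the same position, the concurrent add() calls would be a data race even though the map itself is thread-safe.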

不离久伴 2024-10-01 19:12:43

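As far as API usage of ConcurrentMap goes, Tom's answer is correct. An alternative that avoids putIfAbsent is the computing map from the GoogleCollections/Guava MapMaker, which auto-populates values with a supplied function and handles all the thread safety for you. It creates only one value per key, and if the creation function is expensive, other threads asking for the same key block until the value becomes available.

Edit: As of Guava 11, MapMaker's computing maps are deprecated and replaced by the Cache/LocalCache/CacheBuilder stuff. Its usage is a little more complicated but basically isomorphic.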

心凉 2024-10-01 19:12:43

You can use MutableMap.getIfAbsentPut(K, Function0<? extends V>) from Eclipse Collections (formerly GS Collections).

The advantage over calling get(), doing a null check, and then calling putIfAbsent() is that we'll only compute the key's hashCode once, and find the right spot in the hashtable once. In ConcurrentMaps like org.eclipse.collections.impl.map.mutable.ConcurrentHashMap, the implementation of getIfAbsentPut() is also thread-safe and atomic.

import org.eclipse.collections.impl.map.mutable.ConcurrentHashMap;
...
ConcurrentHashMap<String, MyObject> map = new ConcurrentHashMap<>();
map.getIfAbsentPut("key", () -> someExpensiveComputation());

The implementation of org.eclipse.collections.impl.map.mutable.ConcurrentHashMap is truly non-blocking. While every effort is made not to call the factory function unnecessarily, there's still a chance it will be called more than once during contention.

This fact sets it apart from Java 8's ConcurrentHashMap.computeIfAbsent(K, Function<? super K,? extends V>). The Javadoc for this method states:

The entire method invocation is performed atomically, so the function
is applied at most once per key. Some attempted update operations on
this map by other threads may be blocked while computation is in
progress, so the computation should be short and simple...

Note: I am a committer for Eclipse Collections.
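The at-most-once guarantee quoted from the Javadoc can be observed with nothing but the JDK. The sketch below counts how often the mapping function runs while many threads request the same key; the class and method names are hypothetical, and the map is declared as `java.util.concurrent.ConcurrentHashMap` on purpose, since the quoted guarantee belongs to that class.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class FactoryCallCountSketch {
    // Returns how many times the mapping function ran for a single key.
    static int countFactoryCalls() {
        ConcurrentHashMap<String, Set<Integer>> map = new ConcurrentHashMap<>();
        AtomicInteger calls = new AtomicInteger();
        IntStream.range(0, 10_000).parallel().forEach(i ->
                map.computeIfAbsent("key", k -> {
                    calls.incrementAndGet(); // counts invocations of the factory
                    return ConcurrentHashMap.<Integer>newKeySet();
                }).add(i));
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(countFactoryCalls()); // 1
    }
}
```

The trade-off is the one described above: while the function runs, other updates that touch the same part of the table may block, so the function should stay short.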

审判长 2024-10-01 19:12:43

By keeping a pre-initialized value for each thread you can improve on the accepted answer:

Set<X> initial = new HashSet<X>();
...
Set<X> set = map.putIfAbsent(name, initial);
if (set == null) {
    set = initial;
    initial = new HashSet<X>();
}
set.add(Y);

I recently used this with AtomicInteger map values rather than Set.
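A rough sketch of the AtomicInteger variant mentioned above, assuming each worker thread keeps its own spare counter in a field. The `Worker` class, the word list, and the thread count are invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class SpareCounterSketch {
    // Each worker owns one spare counter and replaces it only after the map took it.
    static final class Worker implements Runnable {
        private final ConcurrentMap<String, AtomicInteger> counts;
        private final String[] words;
        private AtomicInteger spare = new AtomicInteger();

        Worker(ConcurrentMap<String, AtomicInteger> counts, String[] words) {
            this.counts = counts;
            this.words = words;
        }

        @Override
        public void run() {
            for (String word : words) {
                AtomicInteger counter = counts.putIfAbsent(word, spare);
                if (counter == null) {
                    counter = spare;             // the map stored our spare (still zero)
                    spare = new AtomicInteger(); // prepare a fresh spare for the next miss
                }
                counter.incrementAndGet();
            }
        }
    }

    static int demo() {
        ConcurrentMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();
        String[] words = {"a", "b", "a", "c", "a"};
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(new Worker(counts, words));
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counts.get("a").get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 12: "a" appears 3 times for each of 4 workers
    }
}
```

Because AtomicInteger is itself thread-safe, no extra synchronization is needed around the increment, unlike the HashSet case.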

携余温的黄昏 2024-10-01 19:12:43

In 5+ years, I can't believe no one has mentioned or posted a solution that uses ThreadLocal to solve this problem; and several of the solutions on this page are not threadsafe and are just sloppy.

Using ThreadLocals for this specific problem is considered best practice not only for concurrency, but also for minimizing garbage/object creation during thread contention. Also, it's incredibly clean code.

For example:

private final ThreadLocal<HashSet<X>> 
  threadCache = new ThreadLocal<HashSet<X>>() {
      @Override
      protected
      HashSet<X> initialValue() {
          return new HashSet<X>();
      }
  };


private final ConcurrentMap<String, Set<X>> 
  map = new ConcurrentHashMap<String, Set<X>>();

And the actual logic...

// minimize object creation during thread contention
final Set<X> cached = threadCache.get();

Set<X> data = map.putIfAbsent("foo", cached);
if (data == null) {
    // reset the cached value in the ThreadLocal
    threadCache.set(new HashSet<X>());
    data = cached;
}

// make sure that the access to the set is thread safe
synchronized(data) {
    data.add(object);
}

ぃ弥猫深巷。 2024-10-01 19:12:43

My generic approach:

public class ConcurrentHashMapWithInit<K, V> extends ConcurrentHashMap<K, V> {
  private static final long serialVersionUID = 42L;

  public V initIfAbsent(final K key) {
    V value = get(key);
    if (value == null) {
      value = initialValue();
      final V x = putIfAbsent(key, value);
      value = (x != null) ? x : value;
    }
    return value;
  }

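  // Subclasses must override this: ConcurrentHashMap rejects null values,
  // so the default below makes initIfAbsent throw NullPointerException.
  protected V initialValue() {
    return null;
  }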
}

And as an example of use:

public static void main(final String[] args) throws Throwable {
  ConcurrentHashMapWithInit<String, HashSet<String>> map = 
        new ConcurrentHashMapWithInit<String, HashSet<String>>() {
    private static final long serialVersionUID = 42L;

    @Override
    protected HashSet<String> initialValue() {
      return new HashSet<String>();
    }
  };
  map.initIfAbsent("s1").add("chao");
  map.initIfAbsent("s2").add("bye");
  System.out.println(map.toString());
}
