在没有竞争的情况下在 ConcurrentMultimap 上实现删除
我一直在研究编写并发 Multimap,并且我有一个由 Google Guava< /a> AbstractSetMultimap 和 MapMaker 计算映射,根据需要创建值集合作为 ConcurrentHashMap 上的集合视图。通过对视图集合和各种包装器的一些关注,我认为这已经非常接近了。
这个大问题已经讨论过 作者:其他 已经尝试过这个,似乎是当值集合变空时从底层映射中删除值集合,而不需要引入竞争条件。
似乎存在几个选项。
- 将空集合留在那里。这会泄露一些 CHM,但我相信它至少是正确的。
- 尝试乐观地在空集合时删除集合,并在集合中出现其他内容时进行补偿。这充满了竞争,而且似乎本质上是不可能解决的。
- 同步值集合上的所有内容,这至少允许删除,但代价是在按键进行初始查找后出现任何并发性。
- 对于较小的惩罚(也许,取决于使用模式?),也许同步值集合的创建和删除,需要检查是否涵盖了所有内容。
问题:
- 有谁知道比这更好的实现吗?我们可以更好地组合 MapMaker 的位,还是需要从头开始编写专门的 ConcurrentHashMultimap?
- 如果在这方面很难做出很大的改进,那么这种泄漏在实践中可能会成为一个大问题吗?值得注意的集合(例如 java.util.HashMap、juc.ConcurrentHashMap 和 ArrayDeque)不会向下调整后备存储的大小,ArrayList 也不会自动这样做。只要我们把这些东西清理掉,我想这会不会太重要了。
谢谢
编辑:另请参阅此处的讨论< /a> 在番石榴邮件列表上。
编辑2:我已经写下了这个。请参阅此 Google 代码区了解具体实现。我非常感谢任何尝试过它的人的反馈,无论是在那里而不是在这里。
I've been looking at the problem of writing a concurrent Multimap, and I have an implementation backed by the Google Guava AbstractSetMultimap and a MapMaker computing map that creates on demand the values-collections as a set view over a ConcurrentHashMap. With some care over the view collections and various wrappers, I think this gets pretty close.
The big problem, which has already been discussed by others who have tried this, appears to be that of removing the values-collections from the underlying map when they become empty, without introducing race conditions.
A couple of options appear to exist.
- leave the empty collections there. This will leak some CHMs, but I believe it to be at least correct.
- try optimistically to remove the collection when empty and compensate if anything else appears in it. This is full of races and seems intrinsically impossible to fix.
- synchronise everything on the values-collection, which would at least allow this removal, but at the cost of any concurrency after the initial lookup by key.
- for a smaller penalty (perhaps, depending on usage patterns?), perhaps synchronise on values-collection creation and removal, need to check whether that covers everything.
Questions:
- Does anyone know of any better implementation than this? Can we do better composing bits of MapMaker, or does this need a specialised ConcurrentHashMultimap written from scratch?
- If it's difficult to improve much on this, is this leak likely to be much of an issue in practice? Notable collections such as java.util.HashMap, juc.ConcurrentHashMap and ArrayDeque do not resize the backing store downwards, and ArrayList does not do so automatically. As long as we clear out the objects, I wonder whether this will matter too much.
Thanks
Edit: see also the discussion here on the guava mailing list.
Edit 2: I've since written this up. Please see this Google code area for an implementation. I would greatly appreciate any feedback from anyone who tries it, there rather than here.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我之前问过同样的问题,最终实现了 4 种不同的实现。
问题:
高性能并发MultiMap Java/Scala
impl(我称之为索引)
http:// github.com/jboner/akka/blob/master/akka-actor/src/main/scala/actor/ActorRegistry.scala#L314
I asked the same question earlier, and ended up implementing 4 different implementations.
The question:
High-performance Concurrent MultiMap Java/Scala
The impl (I call it Index)
http://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/actor/ActorRegistry.scala#L314
作为后续内容,以下是我在之前的讨论中省略的一些细节(您链接到了有关我的并发多重映射实现的内容)。
该实现遵循您的第一个建议:在支持映射中保留空集合。以下实时查看行为使您的其他建议变得复杂。
对于现实世界的应用程序,与集合库类相反,少于完全并发的多重映射的东西可能就足够了。您可以定义自己的类来实现 Multimap 接口的子集或并发保证的有限选择。或者,您的应用程序逻辑可以最大限度地减少同步多重映射的争用,以避免性能问题。
As a follow-up, here are some details I omitted from the earlier discussions, which you linked to, about my concurrent multimap implementation.
That implementation followed your first suggestion: leaving empty collections in the backing map. The following live-view behavior complicates your other suggestions.
For a real-world application, as opposed to a collection library class, something less than a fully concurrent multimap would probably suffice. You could define your own class that implements a subset of the Multimap interface or a limited selection of concurrency guarantees. Or, your application logic could minimize contention of a synchronized multimap to avoid performance problems.
如果您确实不想泄漏空集合,您可以尝试用每个键占位符 Future 以原子方式替换它。这样,并发添加/删除或添加/添加在再次扩展时应该能够达到一致的状态。
If you really don't want to leak the empty collection you could try to atomically replace it with a per-key placeholder Future. That way concurrent add/remove or add/add should be able to reach consistent state when expanding again.
使用不可变集合作为值是解决/简化基本并发问题的最佳方法,因为您可以使用原子替换方法将其删除。不幸的是,不存在具有一般用途的快速复制/更新配置文件的不可变集合,因此您通常需要执行相当昂贵的复制所有内容。
Using immutable collections as the values is the best way to solve/simplify the basic concurrency issues as you can then remove with the atomic replace method. Unfortunately, there aren't immutable collections with fast copy/update profiles in general use, so you usually need to do quite expensive copy everything.