I'm trying to extend the Clojure language by turning ACI-guaranteed refs into ACID-guaranteed drefs (durable refs). The API is simply a call to (dref key value), where key is a String naming the key to use in the underlying data store (BDB JE in my current implementation), and value is the Object that the dref should be initialized to. If key already exists in the DB, the stored value is used instead.
Multiple drefs can be created with the same key, and they need to be synchronized, i.e. if one dref with key "A" participates in a transaction where it is written, or read with an (ensure), all other drefs with key "A" must be transactionally synchronized: read-locks and write-locks must be used to impose ordering on transactions involving those drefs. In a larger sense, although there may be more than one in-memory dref with the same key, all of the drefs with that key are a single logical object.
For obvious reasons, it's much easier to simply ensure that this single logical dref is implemented with a single concrete in-memory dref. That way there's nothing to synchronize. How do I do this?
The obvious answer is to use an object pool keyed on key. Clojure would then call a static getInstance(key,value) method that returns the dref from the pool if it exists, and otherwise creates it and adds it to the pool. The problem with this approach is that there's no easy way to get Clojure to release the object when it's done. Memory-leak city. I have to ensure that any dref that is still strongly referenced is not collected and stays in the pool. It would be disastrous if the pool lost its reference to a logical dref that is still in use, since another process could then create a new dref with the same key, and it wouldn't be transactionally safe with respect to the other dref with that key.
So I need some version of WeakHashMap, or something else using non-strong references (I would prefer SoftReferences, for a little more reluctance by the GC). So:
- If I use a HashMap<String,SoftReference<DRef>>, how do I ensure that the map evicts an entry once the referent of its value (the SoftReference) has been collected? Some sort of daemon thread?
- How do I make the pool thread-safe for the GC? Or do I not have to worry about that, since the GC is operating at the SoftReference level and my daemon thread would be the one operating at the Map level?
- On a related note, how do I make sure that the daemon thread is running? Is there any way that it can stop without throwing an exception that will crash the entire JVM if uncaught? If so, how do I monitor and start a new one if needed?
Comments (3)
Have you tried google-collections?
They have a MapMaker that gives variations of concurrent hash maps with soft/weak keys and values.
One problem is that equality for weak/soft keys is identity-based, which is annoying, but maybe not too much if the key is a String.
I believe other libraries do this too (e.g. org.apache.commons.collections, though I have never used it).
Psssst... you're recreating Terracotta's distributed shared objects. The internals of Terracotta look very similar to this, although they rely (in DSO) on bytecode manipulation at load time to intercept all reads and writes to a field, whereas in Clojure it's quite a bit easier.
If you want to look at the Terracotta implementation, the ClientObjectManager (http://svn.terracotta.org/svn/tc/dso/trunk/code/base/dso-l1/src/com/tc/object/) is the main client-side class that manages shared objects. Check out the pojoToManaged and look through some of the related code in TCObjectImpl.
1,2) You might find Bob Lee's talk "The Ghost in the Virtual Machine" helpful; it's the best reference I've found for this kind of stuff. SoftReferences and GC (and finalizers) can be kind of tricky.
3) Google for uncaught exception handlers...
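To make point 3 a little more concrete: an uncaught exception terminates only the thread that threw it, not the whole JVM, and `Thread.setUncaughtExceptionHandler` gives you a hook that runs when that happens. The following is a minimal sketch, not a definitive implementation; `SupervisedDaemon` and `startCount` are hypothetical names invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: runs a task on a daemon thread and restarts it if the thread dies. */
final class SupervisedDaemon {
    private final String name;
    private final Runnable task;
    private final AtomicInteger starts = new AtomicInteger();

    SupervisedDaemon(String name, Runnable task) {
        this.name = name;
        this.task = task;
    }

    /** Starts a fresh daemon thread running the task. */
    void start() {
        Thread t = new Thread(task, name + "-" + starts.incrementAndGet());
        t.setDaemon(true); // a daemon thread does not keep the JVM alive at shutdown
        // The handler is invoked when the thread terminates because of an uncaught throwable.
        t.setUncaughtExceptionHandler((thread, error) -> {
            System.err.println(thread.getName() + " died: " + error);
            start(); // replace the dead thread with a new one
        });
        t.start();
    }

    /** Number of threads started so far (1 means no restart has happened). */
    int startCount() {
        return starts.get();
    }
}
```

Note that a task which fails immediately every time would be restarted in a tight loop, so a real version would probably want a back-off or a restart limit. If the task returns normally, the handler is not called and nothing is restarted.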
The easy answer is probably just a Collections.synchronizedMap(new WeakHashMap()), though that doesn't give you thread-safe iteration by itself.
1) You could implement Map<K, V> yourself, and delegate to a ConcurrentHashMap<K, SoftReference<V>>. You can register your SoftReferences with a ReferenceQueue, and either use a daemon thread to remove cleared references from your map or just check your ReferenceQueue before/after each operation (or each nth operation, etc).
2) The GC will only clear your references; you don't have to worry about it mucking with your map, so no threading concerns there.
3) You could look at how the AWT-EventQueue is managed. But:
- your daemon thread will probably be simple enough to not throw unexpected exceptions
- if you're concerned about it, you could wrap the meat of your daemon thread in a loop that catches Exception, which will run forever unless you get an Error (in which case you've got bigger issues).
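Points 1 to 3 can be sketched together roughly as follows. This is an illustration under stated assumptions rather than a drop-in implementation: `SoftValuePool`, `Entry`, `expungeStale`, `startCleaner` and the `Supplier`-based `getInstance` signature are all hypothetical names, and the catch-all loop in `startCleaner` is only a guess at the kind of wrapper meant above. The pool shows both eviction options: polling the ReferenceQueue on each operation, and a daemon thread that blocks on it.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical pool: at most one live instance per key, held through a SoftReference. */
final class SoftValuePool<K, V> {
    /** A SoftReference that remembers its key so the map entry can be evicted later. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;
        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue); // the GC enqueues this reference after clearing it
            this.key = key;
        }
    }

    private final ConcurrentHashMap<K, Entry<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    /** Returns the pooled instance for key, creating it if absent or already cleared. */
    V getInstance(K key, Supplier<V> factory) {
        expungeStale(); // option A: clean up on every operation
        // Keeps a strong reference so the value cannot be cleared before we return it.
        AtomicReference<V> strong = new AtomicReference<>();
        map.compute(key, (k, existing) -> {
            V current = (existing == null) ? null : existing.get();
            if (current != null) {
                strong.set(current);
                return existing;
            }
            V created = factory.get(); // must not call back into this pool
            strong.set(created);
            return new Entry<>(k, created, queue);
        });
        return strong.get();
    }

    /** Removes entries whose referents the GC has already cleared. */
    void expungeStale() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            evict(ref);
        }
    }

    /** Option B: a daemon thread that blocks on the queue instead of polling. */
    Thread startCleaner() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    evict(queue.remove()); // blocks until the GC enqueues a reference
                } catch (InterruptedException e) {
                    return; // interruption is treated as the shutdown signal
                } catch (Exception e) {
                    System.err.println("pool cleaner recovered from: " + e);
                }
            }
        }, "pool-cleaner");
        t.setDaemon(true);
        t.start();
        return t;
    }

    private void evict(Reference<? extends V> ref) {
        @SuppressWarnings("unchecked")
        Entry<K, V> stale = (Entry<K, V>) ref;
        // Two-argument remove: evict only if the map still holds this exact reference.
        map.remove(stale.key, stale);
    }

    int size() {
        return map.size();
    }
}
```

With a pool of type SoftValuePool<String, DRef>, the question's getInstance(key, value) would become something like pool.getInstance(key, () -> new DRef(key, value)), where the DRef constructor is assumed. Doing the lookup and the creation inside a single compute call is what keeps two threads from creating two drefs for the same key.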