Problem
Following up on this question, it seems that a file- or disk-based Map implementation may be the right solution to the problems I mentioned there. Short version:
- Right now, I have a Map implemented as a ConcurrentHashMap.
- Entries are added to it continually, at a fairly fixed rate. Details on this later.
- Eventually, no matter what, this means the JVM runs out of heap space.
At work, it was (strongly) suggested that I solve this problem using SQLite, but after asking that previous question, I don't think that a database is the right tool for this job. So - let me know if this sounds crazy - I think a better solution would be a Map stored on disk.
Bad idea: implement this myself. Better idea: use someone else's library! Which one?
Requirements
Must-haves:
- Free.
- Persistent. The data needs to stick around between JVM restarts.
- Some sort of searchability. Yes, I need the ability to retrieve this darn data as well as put it away. Basic result set filtering is a plus.
- Platform-independent. Needs to be production-deployable on Windows or Linux machines.
- Purgeable. Disk space is finite, just like heap space. I need to get rid of entries that are n days old. It's not a big deal if I have to do this manually.
Nice-to-haves:
- Easy to use. It would be great if I could get this working by the end of the week. Better still: the end of the day. It would be really, really great if I could add one JAR to my classpath, change new ConcurrentHashMap<Foo, Bar>(); to new SomeDiskStoredMap<Foo, Bar>(); and be done.
- Decent scalability and performance. Worst case: new entries are added (on average) 3 times per second, every second, all day long, every day. However, inserts won't always happen that smoothly. It might be (no inserts for an hour) then (insert 10,000 objects at once).
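To make the "purgeable" requirement concrete, here is a minimal sketch of one way to meet it with nothing but the JDK: stamp each value with its insertion time and periodically sweep out anything older than n days. All names here (TimestampedStore, purgeOlderThan) are invented for illustration; a real disk-backed library would have its own eviction API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Sketch of the "purgeable" requirement: wrap each value with its
// insertion time, then drop entries older than a cutoff on demand.
class TimestampedStore<K, V> {
    private static final class Stamped<V> {
        final V value;
        final long insertedAtMillis;
        Stamped(V value, long insertedAtMillis) {
            this.value = value;
            this.insertedAtMillis = insertedAtMillis;
        }
    }

    private final Map<K, Stamped<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new Stamped<>(value, System.currentTimeMillis()));
    }

    V get(K key) {
        Stamped<V> s = map.get(key);
        return s == null ? null : s.value;
    }

    // Remove every entry inserted more than maxAgeDays ago.
    void purgeOlderThan(long maxAgeDays) {
        long cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(maxAgeDays);
        map.entrySet().removeIf(e -> e.getValue().insertedAtMillis < cutoff);
    }

    int size() {
        return map.size();
    }
}
```

Running the purge manually (say, from a scheduled task once a day) would satisfy the "manual is fine" clause above.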
Possible Solutions
- Ehcache? I've never used it before. It was a suggested solution to my previous question.
- Berkeley DB? Again, I've never used it, and I really don't know anything about it.
- Hadoop (and which subproject)? Haven't used it. Based on these docs, its cross-platform-readiness is ambiguous to me. I don't need distributed operation in the foreseeable future.
- A SQLite JDBC driver after all?
- ???
Ehcache and Berkeley DB both look reasonable right now. Any particular recommendations in either direction?
UPDATE (some 4 years after first post...): beware that in newer versions of ehcache, persistence of cache items is available only in the paid product. Thanks @boday for pointing this out.
ehcache is great. It will give you the flexibility you need to implement the map in memory, on disk, or in memory with spillover to disk. If you use this very simple wrapper for java.util.Map then using it is blindingly simple:
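The wrapper the answer refers to is essentially a Map adapter that delegates to a cache. The sketch below shows the shape of such an adapter; the CacheBackend interface and InMemoryBackend class are stand-ins invented here, and with the real wrapper the backend would be an Ehcache cache configured to overflow to disk.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// A hypothetical backend interface standing in for a disk-spilling cache.
interface CacheBackend<K, V> {
    V fetch(K key);
    void store(K key, V value);
    Set<Map.Entry<K, V>> entries();
}

// Trivial in-memory backend, used here only so the sketch is runnable.
class InMemoryBackend<K, V> implements CacheBackend<K, V> {
    private final Map<K, V> data = new HashMap<>();
    public V fetch(K key) { return data.get(key); }
    public void store(K key, V value) { data.put(key, value); }
    public Set<Map.Entry<K, V>> entries() { return data.entrySet(); }
}

// The Map wrapper: java.util.Map calls delegate straight to the backend,
// so swapping the backend changes where the data lives, not the calling code.
class CacheMap<K, V> extends AbstractMap<K, V> {
    private final CacheBackend<K, V> backend;
    CacheMap(CacheBackend<K, V> backend) { this.backend = backend; }

    @Override public V get(Object key) {
        @SuppressWarnings("unchecked") K k = (K) key;
        return backend.fetch(k);
    }
    @Override public V put(K key, V value) {
        V old = backend.fetch(key);
        backend.store(key, value);
        return old;
    }
    @Override public Set<Entry<K, V>> entrySet() { return backend.entries(); }
}
```

The appeal is exactly the one the question asks for: calling code keeps using a plain Map while the storage strategy changes underneath it.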
Have you never heard of prevalence frameworks?
EDIT: some clarifications on the term.
As James Gosling now says, no SQL DB is as efficient as in-memory storage. Prevalence frameworks (the best known being Prevayler and Space4J) are built on this idea of an in-memory, possibly disk-backed, store. How do they work? It's deceptively simple: a storage object contains all persistent entities, and that storage can only be changed by serializable operations. As a consequence, putting an object in storage is a Put operation performed in an isolated context. Because the operation is serializable, it can (depending on configuration) also be saved to disk for long-term persistence. The main data repository remains memory, however, which provides undoubtedly fast access times, at the cost of high memory usage.
Another advantage is that, because of their obvious simplicity, these frameworks rarely contain more than ten classes.
Considering your question, Space4J immediately came to mind (it supports "passivation" of rarely used objects: their index key stays in memory, but the objects themselves are kept on disk as long as they're not used).
Note that you can also find some information at c2wiki.
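The mechanism described above (an in-memory map mutated only through serializable operations that are journaled to disk and replayed on startup) can be sketched with plain JDK serialization. This is a toy illustration of the idea, not the real Prevayler or Space4J API; PrevalentMap and PutOp are names made up here.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Toy prevalence-style store: the in-memory map is the system of record,
// and every change is a serializable operation appended to a journal file,
// so the map can be rebuilt after a JVM restart by replaying the journal.
class PrevalentMap {
    // The only way to mutate the map: a serializable Put operation.
    static final class PutOp implements Serializable {
        private static final long serialVersionUID = 1L;
        final String key;
        final String value;
        PutOp(String key, String value) { this.key = key; this.value = value; }
    }

    private final Map<String, String> state = new HashMap<>();
    private final File journal;

    PrevalentMap(File journal) throws IOException, ClassNotFoundException {
        this.journal = journal;
        replay();
    }

    void put(String key, String value) throws IOException {
        PutOp op = new PutOp(key, value);
        // Serialize the operation, then append it as a length-prefixed record.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(op);
        }
        try (DataOutputStream log =
                 new DataOutputStream(new FileOutputStream(journal, true))) {
            byte[] bytes = buf.toByteArray();
            log.writeInt(bytes.length);
            log.write(bytes);
        }
        apply(op);
    }

    String get(String key) { return state.get(key); }

    private void apply(PutOp op) { state.put(op.key, op.value); }

    // Rebuild in-memory state by re-applying every journaled operation.
    private void replay() throws IOException, ClassNotFoundException {
        if (!journal.exists()) return;
        try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(journal)))) {
            while (true) {
                int len;
                try { len = in.readInt(); } catch (EOFException eof) { break; }
                byte[] bytes = new byte[len];
                in.readFully(bytes);
                apply((PutOp) new ObjectInputStream(
                          new ByteArrayInputStream(bytes)).readObject());
            }
        }
    }
}
```

Real frameworks add snapshots so the journal doesn't grow forever, which also covers the questioner's purge requirement.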
Berkeley DB Java Edition has a Collections API. Within that API, StoredMap in particular is a drop-in replacement for a ConcurrentHashMap. You'll need to create the Environment and Database before creating the StoredMap, but the Collections tutorial should make that pretty easy.
Per your requirements, Berkeley DB is designed to be easy to use, and I think you'll find that it has exceptional scalability and performance. Berkeley DB is available under an open source license; it's persistent, platform independent, and allows you to search for data. The data can certainly be purged/deleted as needed. Berkeley DB has a long list of other features which you may find highly useful to your application, especially as your requirements change and grow with the success of the application.
If you decide to use Berkeley DB Java Edition, please be sure to ask questions on the BDB JE Forum. There's an active developer community that's happy to help answer questions and resolve problems.
We have a similar solution implemented using Xapian. It's fast, it's scalable, it provides almost all the search functionality you requested, it's free, multiplatform, and of course purgeable.
I came across jdbm2 a few weeks ago. The usage is very simple; you should be able to get it working in half an hour. One drawback is that any object put into the map must be serializable, i.e. implement Serializable. Other cons are given on their website. However, no object-persistence database is a permanent solution for storing objects of your own Java classes: if you decide to change the fields of the class, you will no longer be able to retrieve the old objects from the map collection. It is ideal for storing standard serializable classes like String, Integer, etc.
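The Serializable requirement and the class-evolution caveat come straight from plain JDK serialization, which stores like jdbm2 rely on to turn objects into bytes. The sketch below shows the round-trip; Foo is a made-up example class, and pinning serialVersionUID is the standard way to keep old bytes readable across compatible class changes.

```java
import java.io.*;

// A made-up user class. Without an explicit serialVersionUID, any field
// change alters the computed version and old serialized bytes fail to load
// with InvalidClassException; pinning it keeps compatible changes readable.
class Foo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int count;
    Foo(String name, int count) { this.name = name; this.count = count; }
}

// The serialize/deserialize round-trip a jdbm2-style map performs per entry.
class SerializationDemo {
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }
}
```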
The google-collections library, part of http://code.google.com/p/guava-libraries/, has some really useful Map tools. MapMaker in particular lets you make concurrent hash maps with timed evictions, soft values that will be swept up by the garbage collector if you're running out of heap, and computing functions.
That will give you a Map cache that cleans up after itself and can compute its own values. If you're able to compute values like that then great; otherwise it would map perfectly onto http://redis.io/, which you'd be writing into (to be fair, redis would probably be fast enough on its own!).
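Two of the MapMaker features mentioned above can be sketched with the JDK alone: values held through SoftReferences (so the GC may reclaim them under memory pressure) and a computing function that fills in missing values on demand. MapMaker configures this declaratively; none of the names below come from google-collections.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// JDK-only sketch: soft-valued map with a computing function. If a value
// was never cached, or the GC cleared its SoftReference under memory
// pressure, it is recomputed and cached again.
class SoftComputingMap<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> computer;

    SoftComputingMap(Function<K, V> computer) { this.computer = computer; }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = computer.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Note this only bounds heap use; unlike a disk-backed map, reclaimed values are lost rather than persisted, which is why the answer pairs it with redis for durability.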