How to cache information in a DAO in a thread-safe manner

Posted on 2024-07-25 17:18:12

I often need to implement DAOs for reference data that doesn't change very often. I sometimes cache this data in a collection field on the DAO, so that it is loaded only once and explicitly updated when required.

However, this brings in many concurrency issues: what if another thread attempts to access the data while it is being loaded or updated?

Obviously this can be handled by making both the getters and setters of the data synchronised, but for a large web application this is quite an overhead.

I've included a trivial flawed example of what I need as a strawman. Please suggest alternative ways to implement this.

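public class LocationDAOImpl implements LocationDAO {

    private List<Location> locations = null;

    public List<Location> getAllLocations() {
        // Flawed on purpose: this check-then-load is not thread-safe.
        if (locations == null) {
            loadAllLocations();
        }
        return locations;
    }

    // loadAllLocations() is not shown in the original post.
}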

For further information I'm using Hibernate and Spring but this requirement would apply across many technologies.

Some further thoughts:

Should this not be handled in code at all - instead let ehcache or similar handle it?
Is there a common pattern for this that I'm missing?
There are obviously many ways this can be achieved but I've never found a pattern that is simple and maintainable.

Thanks in advance!

Comments (6)

樱花落人离去 2024-08-01 17:18:12

The simplest and safest way is to include the ehcache library in your project and use it to set up a cache. Its authors have already solved the issues you are likely to encounter, and they have made the library as fast as possible.

诗酒趁年少 2024-08-01 17:18:12

In situations where I've rolled my own reference data cache, I've typically used a ReadWriteLock to reduce thread contention. Each of my accessors then takes the form:

public PersistedUser getUser(String userName) throws MissingReferenceDataException {
    PersistedUser ret;

    rwLock.readLock().lock();
    try {
        ret = usersByName.get(userName);

        if (ret == null) {
            throw new MissingReferenceDataException(String.format("Invalid user name: %s.", userName));
        }
    } finally {
        rwLock.readLock().unlock();
    }

    return ret;
}

The only method to take out the write lock is refresh(), which I typically expose via an MBean:

public void refresh() {
    logger.info("Refreshing reference data.");
    rwLock.writeLock().lock();
    try {
        usersById.clear();
        usersByName.clear();

        // Refresh data from underlying data source.

    } finally {
        rwLock.writeLock().unlock();
    }
}

Incidentally, I opted for implementing my own cache because:

  • My reference data collections are small so I can always store them all in memory.
  • My app needs to be simple / fast; I want as few dependencies on external libraries as possible.
  • The data is rarely updated, and when it is, the call to refresh() is fairly quick. Hence I eagerly initialise my caches (unlike in your straw man example), which means accessors never need to take out the write lock.
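
For completeness, the two fragments above can be combined into one self-contained sketch. The class name UserCache, the String values, and the Supplier-based loader are assumptions made for illustration (they stand in for PersistedUser and the real DAO query); only the locking pattern comes from the answer. One deliberate variation: the data is fetched before the write lock is taken, so readers are blocked only while the map is swapped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical eagerly initialised reference-data cache guarded by a ReadWriteLock.
final class UserCache {
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Map<String, String> usersByName = new HashMap<>();
    private final Supplier<Map<String, String>> loader; // assumed stand-in for the DAO query

    UserCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        refresh(); // eager initialisation: accessors never need the write lock
    }

    String getUser(String userName) {
        rwLock.readLock().lock(); // many readers may hold the read lock at once
        try {
            String ret = usersByName.get(userName);
            if (ret == null) {
                throw new IllegalArgumentException("Invalid user name: " + userName);
            }
            return ret;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    void refresh() {
        // Fetch outside the lock so readers are blocked only during the swap.
        Map<String, String> fresh = loader.get();
        rwLock.writeLock().lock(); // exclusive: waits for in-flight readers to finish
        try {
            usersByName.clear();
            usersByName.putAll(fresh);
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}
```

A caller would construct the cache once (for example as a singleton bean) and call refresh() whenever the reference data changes.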

裂开嘴轻声笑有多痛 2024-08-01 17:18:12

If you just want a quick roll-your-own caching solution, have a look at this article on JavaSpecialist, which is a review of the book Java Concurrency in Practice by Brian Goetz.

It talks about implementing a basic thread-safe cache using a FutureTask and a ConcurrentHashMap.

The way this is done ensures that only one of the concurrent threads triggers the long-running computation (in your case, the database calls in your DAO).

You'd have to modify this solution to add cache expiry if you need it.
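
A minimal sketch of that FutureTask plus ConcurrentHashMap idea follows. It is written from memory of the general pattern, not copied from the article or the book; the class name Memoizer, the Function-based loader, and the invalidate() method (a crude manual substitute for expiry) are assumptions for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

// Hypothetical memoizing cache: at most one thread runs the load for a given key.
final class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed stand-in for the expensive DAO call

    Memoizer(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        Future<V> f = cache.get(key);
        if (f == null) {
            FutureTask<V> task = new FutureTask<V>(() -> loader.apply(key));
            f = cache.putIfAbsent(key, task); // atomic: only one task is registered per key
            if (f == null) {
                f = task;
                task.run(); // this thread won the race, so it performs the load
            }
        }
        try {
            return f.get(); // other threads block here until the load completes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while loading " + key, e);
        } catch (ExecutionException e) {
            cache.remove(key, f); // do not keep a failed load in the cache
            throw new IllegalStateException("Load failed for " + key, e.getCause());
        }
    }

    // Manual invalidation; time-based expiry would need additional bookkeeping.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Storing the Future instead of the value is what lets late-arriving threads wait for the in-progress load instead of starting their own.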

The other consideration when caching it yourself is garbage collection. A cache held in an ordinary map keeps every entry strongly reachable, so the GC cannot release that memory even when the heap is running low. If you are caching infrequently accessed data (data that is still worth caching because it is expensive to compute), you might want to help the garbage collector by using a WeakHashMap. Bear in mind that a WeakHashMap holds its keys weakly, so an entry becomes eligible for removal as soon as nothing else strongly references its key.
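
As a variation on the WeakHashMap suggestion (this swaps in a different technique and is not what the answer proposes), cached values can be wrapped in SoftReference, which the JVM clears in response to memory demand. The class name SoftValueCache and the Function-based loader below are assumptions for illustration. Note that this sketch does not prevent two threads from loading the same key at the same time.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical cache whose values may be reclaimed by the GC under memory pressure.
final class SoftValueCache<K, V> {
    private final ConcurrentMap<K, SoftReference<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed stand-in for the expensive computation

    SoftValueCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = cache.get(key);
        V value = (ref == null) ? null : ref.get(); // null if never loaded or already cleared
        if (value == null) {
            value = loader.apply(key); // reload on a miss or after the GC cleared the value
            cache.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

The trade-off is unpredictability: a value may disappear at any collection, so the loader must be safe to call again at any time.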

静赏你的温柔 2024-08-01 17:18:12

If your reference data is immutable, Hibernate's second-level cache could be a reasonable solution.

请止步禁区 2024-08-01 17:18:12

Obviously this can be handled by making both the getters and setters of the data synchronised - but for a large web application this is quite an overhead.

I've included a trivial flawed example of what I need as a strawman. Please suggest alternative ways to implement this.

While this might be somewhat true, you should note that the sample code you've provided certainly needs to be synchronized to avoid concurrency issues when lazy-loading the locations. If that accessor is not synchronized, then:

  • Multiple threads may enter the loadAllLocations() method at the same time.
  • Some threads may enter loadAllLocations() even after another thread has completed the method and assigned the result to locations. Under the Java Memory Model there is no guarantee that other threads will see the change to the variable without synchronization.

Be careful when using lazy loading/initialization. It seems like a simple performance boost, but it can cause lots of nasty threading issues.
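
One way to keep the lazy load while addressing both problems is a volatile field combined with a re-check inside a synchronized block. The sketch below is an illustration under stated assumptions, not the answerer's code: the class name LazyLocationCache, the use of String in place of Location, and the Supplier that stands in for loadAllLocations() are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical thread-safe variant of the lazy-loading strawman.
final class LazyLocationCache {
    private final Supplier<List<String>> loader; // assumed stand-in for loadAllLocations()
    private volatile List<String> locations;     // volatile makes the loaded list visible to all threads

    LazyLocationCache(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    List<String> getAllLocations() {
        List<String> result = locations; // fast path: one volatile read, no lock
        if (result == null) {
            synchronized (this) {
                result = locations; // re-check: another thread may have loaded it meanwhile
                if (result == null) {
                    result = Collections.unmodifiableList(new ArrayList<>(loader.get()));
                    locations = result;
                }
            }
        }
        return result;
    }

    // Explicit update: build the new list first, then publish it with a single write.
    synchronized void refresh() {
        locations = Collections.unmodifiableList(new ArrayList<>(loader.get()));
    }
}
```

Once the list is loaded, readers never take the lock, and because the published list is unmodifiable, callers cannot corrupt the shared copy.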

难以启齿的温柔 2024-08-01 17:18:12

I think it's best to not do it yourself, because getting it right is a very difficult thing. Using EhCache or OSCache with Hibernate and Spring is a far better idea.

Besides, it makes your DAOs stateful, which might be problematic. You should have no state at all, besides the connection, factory, or template objects that Spring manages for you.

UPDATE: If your reference data isn't too large, and truly never changes, perhaps an alternative design would be to create enumerations and dispense with the database altogether. No cache, no Hibernate, no worries. Perhaps oxbow_lakes' point is worth considering: perhaps it could be a very simple system.
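
As a rough illustration of the enumeration idea, the sketch below replaces a database-backed location table with an enum. The constants, codes, and the fromCode() helper are invented for the example and do not come from the question.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical reference data expressed as an enum: immutable, loaded once by the JVM,
// and safe to share between threads without any locking.
enum Location {
    LONDON("LON", "London"),
    NEW_YORK("NYC", "New York"),
    TOKYO("TYO", "Tokyo");

    private final String code;
    private final String displayName;

    Location(String code, String displayName) {
        this.code = code;
        this.displayName = displayName;
    }

    String code() {
        return code;
    }

    String displayName() {
        return displayName;
    }

    // Look up a constant by its code; empty if the code is unknown.
    static Optional<Location> fromCode(String code) {
        return Arrays.stream(values())
                .filter(l -> l.code.equals(code))
                .findFirst();
    }
}
```

The cost of this design is that any change to the data requires a code change and a redeploy, so it only fits data that truly never changes.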
