How to cache information in a DAO in a thread-safe manner
I often need to implement DAOs for reference data that doesn't change very often. I sometimes cache this in a collection field on the DAO, so that it is loaded only once and explicitly updated when required.
However, this brings in many concurrency issues: what if another thread attempts to access the data while it is loading or being updated?
Obviously this can be handled by making both the getters and setters of the data synchronized, but for a large web application this is quite an overhead.
I've included a trivial, flawed example of what I need as a strawman. Please suggest alternative ways to implement this.
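public class LocationDAOImpl implements LocationDAO {

    private List<Location> locations = null;

    public List<Location> getAllLocations() {
        // Flawed on purpose: this check-then-load is not synchronized, so several
        // threads may load at once or never see another thread's write.
        if (locations == null) {
            loadAllLocations();
        }
        return locations;
    }

    // loadAllLocations() (not shown) queries the database and assigns locations.
}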
For further information, I'm using Hibernate and Spring, but this requirement would apply across many technologies.
Some further thoughts:
Should this not be handled in code at all, and instead be left to ehcache or similar?
Is there a common pattern for this that I'm missing?
There are obviously many ways this can be achieved but I've never found a pattern that is simple and maintainable.
Thanks in advance!
Comments (6)
The simplest and safest way is to include the ehcache library in your project and use that to set up a cache. Its developers have already solved the issues you are likely to encounter, and they have tuned the library for speed.
In situations where I've rolled my own reference data cache, I've typically used a ReadWriteLock to reduce thread contention. Each of my accessors then takes the form:
The only method to take out the write lock is refresh(), which I typically expose via an MBean:
Incidentally, I opted for implementing my own cache because:
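As an illustration of the ReadWriteLock approach described in this answer, a minimal sketch might look like the following. The names ReferenceDataCache and loader are hypothetical and not from the original answer; the loader stands in for whatever DAO query fetches the reference data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical reference-data cache guarded by a ReadWriteLock. */
final class ReferenceDataCache<T> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Supplier<List<T>> loader; // assumption: wraps the real database query
    private List<T> items = Collections.emptyList();

    ReferenceDataCache(Supplier<List<T>> loader) {
        this.loader = loader;
        refresh(); // load eagerly so accessors never see an unloaded cache
    }

    /** Accessor: any number of readers may hold the read lock concurrently. */
    List<T> getAll() {
        lock.readLock().lock();
        try {
            return items; // unmodifiable snapshot, safe to hand out
        } finally {
            lock.readLock().unlock();
        }
    }

    /** The only method that takes the write lock. */
    void refresh() {
        // Query outside the lock so readers are blocked only for the swap itself.
        List<T> fresh = Collections.unmodifiableList(new ArrayList<>(loader.get()));
        lock.writeLock().lock();
        try {
            items = fresh;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Because refresh() replaces the whole list in one step, readers see either the old snapshot or the new one, never a half-updated collection. Exposing refresh() through JMX, as the answer mentions, would be a separate wiring step that is not shown here.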
If you just want a quick roll-your-own caching solution, have a look at this article on JavaSpecialist, which is a review of the book Java Concurrency in Practice by Brian Goetz.
It talks about implementing a basic thread-safe cache using a FutureTask and a ConcurrentHashMap.
The way this is done ensures that only one thread triggers the long-running computation for a given key (in your case, the database calls in your DAO), while concurrent callers wait for its result.
You'd have to modify this solution to add cache expiry if you need it.
The other thing to consider when caching it yourself is garbage collection. A cache that holds strong references to its entries prevents the GC from reclaiming that memory. If you are caching infrequently accessed data (that is still worth caching because it is hard to compute), you might want to help out the garbage collector by using a WeakHashMap. Note that a WeakHashMap holds only its keys weakly, so an entry can be discarded as soon as nothing else strongly references its key, not just when memory runs low.
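The FutureTask plus ConcurrentHashMap idea can be sketched roughly as follows. This is a simplified illustration in the spirit of the Memoizer from Java Concurrency in Practice, not a copy of the article's code; the class name, the loader function, and the choice to wrap failures in IllegalStateException are assumptions made for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

/** Hypothetical cache that runs the loader at most once per key. */
final class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumption: wraps the expensive DAO call

    Memoizer(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        while (true) {
            Future<V> future = cache.get(key);
            if (future == null) {
                Callable<V> call = () -> loader.apply(key);
                FutureTask<V> task = new FutureTask<>(call);
                future = cache.putIfAbsent(key, task);
                if (future == null) {
                    // This thread won the race, so it performs the load.
                    future = task;
                    task.run();
                }
            }
            try {
                return future.get(); // other threads block here until the load finishes
            } catch (CancellationException e) {
                cache.remove(key, future); // drop the cancelled task and retry
            } catch (ExecutionException e) {
                cache.remove(key, future); // do not cache failures
                throw new IllegalStateException("load failed for " + key, e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while loading " + key, e);
            }
        }
    }
}
```

Storing the Future rather than the value is what lets late arrivals wait on an in-flight load instead of starting their own. Expiry is deliberately left out, as the answer notes.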
If your reference data is immutable, Hibernate's second-level cache could be a reasonable solution.
While the concern about synchronization overhead might be somewhat true, you should take note that the sample code you've provided certainly needs to be synchronized to avoid any concurrency issues when lazy-loading the locations. If that accessor is not synchronized, then you will have: (1) multiple threads running the loadAllLocations() method at the same time; and (2) threads calling loadAllLocations() even after another thread has completed the method and assigned the result to locations. Under the Java Memory Model there is no guarantee that other threads will see the change in the variable without synchronization.
Be careful when using lazy loading/initialization: it seems like a simple performance boost, but it can cause lots of nasty threading issues.
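A minimal sketch of a synchronized lazy-loading accessor, under some simplifying assumptions: String stands in for the question's Location type, and a hypothetical loader supplier stands in for loadAllLocations().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical lazily loaded holder; every access goes through one monitor. */
final class LazyLocations {
    private final Supplier<List<String>> loader; // assumption: wraps the database query
    private List<String> locations;              // guarded by "this"

    LazyLocations(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    /**
     * synchronized gives mutual exclusion (only one thread loads) and
     * visibility (later callers are guaranteed to see the assigned list).
     */
    synchronized List<String> getAllLocations() {
        if (locations == null) {
            locations = Collections.unmodifiableList(new ArrayList<>(loader.get()));
        }
        return locations;
    }
}
```

After the first load the lock is uncontended for only a null check and a return, so the cost is usually small compared with a database round trip; measuring it in the real application is the only way to know whether it matters.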
I think it's best not to do it yourself, because getting it right is very difficult. Using EhCache or OSCache with Hibernate and Spring is a far better idea.
Besides, it makes your DAOs stateful, which might be problematic. You should have no state at all, besides the connection, factory, or template objects that Spring manages for you.
UPDATE: If your reference data isn't too large, and truly never changes, perhaps an alternative design would be to create enumerations and dispense with the database altogether. No cache, no Hibernate, no worries. Perhaps oxbow_lakes' point is worth considering: perhaps it could be a very simple system.
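For the enumeration alternative, a sketch might look like this. The constants and the displayName field are purely illustrative and do not come from the question.

```java
/** Hypothetical fixed reference data expressed as an enum instead of a table. */
enum Location {
    LONDON("London"),
    PARIS("Paris");

    private final String displayName;

    Location(String displayName) {
        this.displayName = displayName;
    }

    String displayName() {
        return displayName;
    }
}
```

Enum constants are created once during class initialization and are immutable here, so they can be shared across threads without any locking. The trade-off is that changing the data means redeploying the code.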