Caching a huge amount of data on disk
I have a requirement where a huge amount of data needs to be cached on disk.
Whenever there is a change in the database, the data is retrieved from the database and cached on disk. A background process will keep comparing the cached data with the database and update it as and when required.
I would like to know the best way to organize the cached data on disk so that reading from and writing to the cache is fast.
Another thread will be used to fetch new data from the database and cache it on disk. I also need to take care of synchronization between the two threads (one will be updating the existing cache data, and the other will be writing newly fetched data into the cache).
Please suggest a strategy for organizing the data in the cache and for synchronizing the threads.
Comments (3)
SQL Server has something called XML tables. These tables are based on physical XML files located on disk. You can map/link XML data on disk to a table in SQL Server. For users this is seamless; in other words, they see those tables as regular tables.
Apart from the technical/philosophical discussion about caching huge data on disk, this is just an idea...
Do you care about the consistency of the data, for example on power failures?
Memory-mapped files along with occasional flushes will probably get you what you want.
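To make the memory-mapped suggestion concrete, here is a minimal Python sketch. It assumes fixed-size record slots, which the question does not specify; `RECORD_SIZE`, `open_cache`, `write_record` and `read_record` are hypothetical names chosen for illustration. Note that `flush()` only asks the OS to write dirty pages back; on its own it does not make a multi-record update atomic across a power failure.

```python
import mmap
import os
import struct

RECORD_SIZE = 64               # hypothetical fixed slot size, in bytes
HEADER = struct.Struct("<H")   # 2-byte little-endian payload length prefix


def open_cache(path, n_records):
    """Create (or reuse) a file of fixed-size slots and map it into memory."""
    size = RECORD_SIZE * n_records
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)   # grow the file; new bytes read as zero
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)                 # the mapping keeps its own duplicate handle


def write_record(mm, slot, payload):
    """Store payload in the given slot, padded with zero bytes."""
    capacity = RECORD_SIZE - HEADER.size
    if len(payload) > capacity:
        raise ValueError("payload does not fit in one slot")
    off = slot * RECORD_SIZE
    mm[off:off + RECORD_SIZE] = HEADER.pack(len(payload)) + payload.ljust(capacity, b"\0")


def read_record(mm, slot):
    """Return the payload stored in the given slot (empty if never written)."""
    off = slot * RECORD_SIZE
    (length,) = HEADER.unpack(mm[off:off + HEADER.size])
    start = off + HEADER.size
    return bytes(mm[start:start + length])
```

A writer would call `write_record(...)` as data arrives and `mm.flush()` every so often (the "occasional flush"); how often depends on how much recent data you can afford to lose.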
Do you need indexed access to the data?
You probably need a B-tree or B+tree implementation, which gives efficient retrieval of the indexed data and better block-level locking.
http://code.google.com/p/high-concurrency-btree/
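The block-level locking idea can be illustrated without a full B-tree. The sketch below is not a B-tree and is not taken from the linked project; it only shows lock striping, where each key maps to one of a fixed number of locks so that the updater thread and the fetcher thread block each other only when they touch the same stripe. `StripedCache` and its in-memory dict (a stand-in for on-disk blocks) are assumptions made for illustration.

```python
import threading
import zlib


class StripedCache:
    """Key/value cache guarded by several locks instead of one global lock."""

    def __init__(self, n_stripes=16):
        self._locks = [threading.Lock() for _ in range(n_stripes)]
        self._data = {}  # stand-in for the on-disk blocks

    def _lock_for(self, key):
        # A stable hash of the key picks the stripe (the "block") to lock.
        return self._locks[zlib.crc32(key.encode("utf-8")) % len(self._locks)]

    def put(self, key, value):
        """Used by the fetcher thread to add newly fetched data."""
        with self._lock_for(key):
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock_for(key):
            return self._data.get(key, default)

    def update(self, key, fn):
        """Used by the refresher thread: read-modify-write under one lock."""
        with self._lock_for(key):
            self._data[key] = fn(self._data.get(key))
            return self._data[key]
```

In a real on-disk structure the stripe would correspond to a page or tree node rather than a hash bucket, but the synchronization pattern is the same: hold the narrowest lock that covers the data being changed.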
As an alternative answer, my own B+Tree implementation will neatly address this as a fully managed-code (C#) implementation of an IDictionary<TKey, TValue>. It is a single-file key/value store that is thread-safe and optimized for concurrency. It was built from the ground up expressly for this purpose and for providing write-through caches.
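For readers unfamiliar with the term, a write-through cache writes every change to the backing store before (or as) it updates the in-memory copy, so the disk never lags behind memory. The Python sketch below illustrates only that pattern; it is not the C# library described above. It uses the standard-library `shelve` module as a stand-in single-store backend and one coarse lock, so it is thread-safe but not optimized for concurrency. `WriteThroughCache` is a hypothetical name.

```python
import shelve
import threading


class WriteThroughCache:
    """Dictionary-like cache whose writes go to disk before memory."""

    def __init__(self, path):
        self._store = shelve.open(path)   # persistent key/value store (str keys)
        self._memory = {}                 # hot copies of recently used entries
        self._lock = threading.Lock()     # one coarse lock, for simplicity

    def __setitem__(self, key, value):
        with self._lock:
            self._store[key] = value      # write through to disk first
            self._store.sync()
            self._memory[key] = value     # then refresh the in-memory copy

    def __getitem__(self, key):
        with self._lock:
            if key not in self._memory:
                # Miss: load from disk; raises KeyError if the key is absent.
                self._memory[key] = self._store[key]
            return self._memory[key]

    def close(self):
        with self._lock:
            self._store.close()
```

Both the refresher and the fetcher thread from the question could share one such object; a store built for concurrency would replace the single lock with finer-grained locking.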