持久(纯功能性)红黑树对磁盘性能的影响
我正在研究实现简单的开源对象时态数据库的最佳数据结构,目前我非常喜欢使用持久红黑树来实现它。
我使用持久数据结构的主要原因首先是尽量减少锁的使用,这样数据库可以尽可能并行。此外,实现 ACID 事务也将变得更加容易,甚至能够抽象数据库以在某种集群上并行工作。 这种方法的伟大之处在于它使得几乎免费地实现时态数据库成为可能。这是非常好的东西,特别是对于网络和数据分析(例如趋势)。
所有这些都非常酷,但我对在磁盘上使用持久数据结构的整体性能有点怀疑。尽管现在有一些非常快的磁盘可用,并且所有写入都可以异步完成,因此响应总是立即的,但我不想在错误的前提下构建所有应用程序,只是意识到这并不是一个很好的选择方法来做到这一点。
这是我的想法: - 由于所有写入都是异步完成的,并且使用持久数据结构将不会使先前的和当前有效的结构失效,因此写入时间并不是真正的瓶颈。 - 有一些关于结构的文献,例如 这个 是完全适合磁盘使用。但在我看来,这些技术会增加更多的读取开销以实现更快的写入。但我认为恰恰相反更好。此外,其中许多技术确实最终得到了多版本树,但它们并不是严格不可变的,这对于证明持久开销的合理性非常重要。 - 我知道在将值附加到数据库时仍然必须有某种锁定,而且我也知道如果不是要维护所有版本,则应该有一个良好的垃圾收集逻辑(否则文件大小肯定会急剧增加) 。还可以考虑增量压缩系统。 - 在所有搜索树结构中,我真的认为红黑树最接近我需要的,因为它们提供的旋转次数最少。
但在此过程中可能存在一些陷阱: - 异步写入可能会影响需要实时数据的应用程序。但我认为大多数时候 Web 应用程序的情况并非如此。此外,当需要实时数据时,可以设计另一种解决方案,例如需要以更实时的方式工作的特定数据的签入/签出系统。 - 它们也可能导致一些提交冲突,尽管我无法想到何时会发生这种情况的一个很好的例子。如果两个线程正在处理相同的数据,那么在普通 RDBMS 中也可能会发生提交冲突,对吗? - 像这样的不可变接口的开销将呈指数级增长,一切都注定很快就会失败,所以这一切都是一个坏主意。
有什么想法吗?
谢谢!
编辑: 对于什么是持久数据结构似乎存在误解:
I'm studying the best data structures to implement a simple open-source object temporal database, and currently I'm very fond of using Persistent Red-Black trees to do it.
My main reasons for using persistent data structures is first of all to minimize the use of locks, so the database can be as parallel as possible. Also it will be easier to implement ACID transactions and even being able to abstract the database to work in parallel on a cluster of some kind.
The great thing of this approach is that it makes possible implementing temporal databases almost for free. And this is something quite nice to have, specially for web and for data analysis (e.g. trends).
All of this is very cool, but I'm a little suspicious about the overall performance of using a persistent data structure on disk. Even though there are some very fast disks available today, and all writes can be done asynchronously, so a response is always immediate, I don't want to build all application under a false premise, only to realize it isn't really a good way to do it.
Here's my line of thought:
- Since all writes are done asynchronously, and using a persistent data structure will enable not to invalidate the previous - and currently valid - structure, the write time isn't really a bottleneck.
- There are some literature on structures like this that are exactly for disk usage. But it seems to me that these techniques will add more read overhead to achieve faster writes. But I think that exactly the opposite is preferable. Also many of these techniques really do end up with a multi-versioned trees, but they aren't strictly immutable, which is something very crucial to justify the persistent overhead.
- I know there still will have to be some kind of locking when appending values to the database, and I also know there should be a good garbage collecting logic if not all versions are to be maintained (otherwise the file size will surely rise dramatically). Also a delta compression system could be thought about.
- Of all search trees structures, I really think Red-Blacks are the most close to what I need, since they offer the least number of rotations.
But there are some possible pitfalls along the way:
- Asynchronous writes -could- affect applications that need the data in real time. But I don't think that is the case with web applications, most of the time. Also when real-time data is needed, another solutions could be devised, like a check-in/check-out system of specific data that will need to be worked on a more real-time manner.
- Also they could lead to some commit conflicts, though I fail to think of a good example of when it could happen. Also commit conflicts can occur in normal RDBMS, if two threads are working with the same data, right?
- The overhead of having an immutable interface like this will grow exponentially and everything is doomed to fail soon, so this all is a bad idea.
Any thoughts?
Thanks!
edit:
There seems to be a misunderstanding of what a persistent data structure is:
http://en.wikipedia.org/wiki/Persistent_data_structure
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果您发现写入时间遇到瓶颈,或者如果没有同步写入,持久性保证就毫无意义(嗯...),您应该执行大多数其他数据库所做的操作:实现 预写日志 (WAL),或重做日志。
磁盘实际上非常擅长顺序写入,或者至少这是它们最擅长的。这是随机写入(例如树中的写入),速度非常慢。即使闪存驱动器在随机写入方面击败了磁盘,但在顺序写入方面仍然明显更好。实际上,即使大多数 RAM 也更擅长顺序写入,因为涉及的控制信号较少。
通过使用预写日志,您不必担心:
If you find you are getting bottlenecked on write time, or that your durability guarantee is meaningless without synchronous writes (hmm...), you should do what most other databases do: implement a Write-Ahead Log (WAL), or a redo-log.
Disks are actually pretty darn good at writing sequentially, or at least that's what they're best at. It's random writes (such as those in a tree) that are terribly slow. Even flash drives, which beat the hell out of disks for random writes, are still significantly better at sequential writes. Actually, even most RAM is better at sequential writes because there are fewer control signals involved.
By using a write-ahead log, you don't have to worry about:
我的想法是你有个好主意。现在去建造那个该死的东西吧。从您所写的所有内容来看,您似乎患有严重的分析瘫痪。
My thought is that you have a great idea. Now go build the darn thing. From everything you've written, it sounds like you're suffering from an acute case of analysis paralysis.
我知道这个问题有点老了,但我一直在实现几乎相同的事情,我发现,作为二叉树意味着性能很糟糕(由于搜索次数)。尽管有额外的空间开销,但尝试创建一个更广泛的持久树可能是一个更好的主意。
I know this question is a little old, but I've been implementing the almost the same thing and what I've found is that, being a binary tree means that the performance is terrible (due to the number of seeks). It is probably a much better idea to try to make a much broader persistent tree despite the extra space overhead.
与志趣相投的人感兴趣:-) 我实际上已经实现了一个使用持久数据结构作为其数据模型的数据库。我想人们可以称其为一种持久 B2 树。仅附加存储到磁盘和垃圾收集 - 并非所有历史记录都需要永远保留。人们可以设置一个有限的保留期,以允许数据库忘记早期的历史记录。
请参阅http://bergdb.com/
Interesting with someone likeminded :-) I have actually implemented a database that uses a persistant data structure as its data model. A type of persistent B2-tree, I suppose one could call it. Append-only storage to disk and garbage collection - not all history need to be kept forever. One can set a finite retain period to allow the database to forget about early history.
See http://bergdb.com/