寻找磁盘绑定的 B 树示例
也许我的 google-foo 只是不符合要求,但我想使用绑定到磁盘的 b 树算法。由于大多数教程和示例都是内存上的,因此它们假设随机存取内存,其中更改树中的节点非常简单,但除了 I/O 密集型重写或使用内存映射文件之外,我想不出一个好的方法方法。
理论就可以,C# 或 Java 会更好。
编辑:对于缺乏清晰度,我深表歉意。我并不是在寻找可供使用的产品或代码库,而是在寻找示例或说明性代码库,以更好地理解如何构建磁盘支持的 B 树。
Maybe my google-foo just isn't up to snuff, but I'm wanting to play with a b-tree alogrithm that is bound to disk. Since most tutorials and examples are on-memory, they assume random access memory in which changing nodes in a tree is simple enough, but other than rather I/O intensive rewriting or using memory-mapped files, I can't think of a good approach.
Theory would be fine, C# or Java would be even better.
EDIT: I apologize for the lack of clarity. I am not looking for a product or code base to use, but an example or illustrative codebase to better understand how one would go about building a disk-backed b-tree.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
Tokyo Cabinet(或京都 Cabinet)是最快的键值数据库之一(还包含与 B+Tree 一起使用的键值数据库)。我在对 B+Tree 进行基准测试时使用过它,并且代码很容易理解。它是用 C 编写的,但也有 Java 绑定......
东京内阁:
http://fallabs.com/tokyocabinet/
Berkley DB 也可以使用 B+Tree。然而,当我对 Berkley DB 进行基准测试时,它与 Tokyo Cabinet 相比非常慢......
http://www.oracle.com/technetwork/database/berkeleydb/overview /index.html
One of the fastest Key Value DB (which also contains a Key Value DB which works with B+Trees) is Tokyo Cabinet (or Kyoto Cabinet) :). I have worked with it when I was benchmarking B+Trees and the code is easy to understand. It's written in C but it has also Java bindings...
Tokyo Cabinet:
http://fallabs.com/tokyocabinet/
And Berkley DB also works with a B+Tree. However, when I was benchmarking Berkley DB, it was very slow compared to Tokyo Cabinet though...
http://www.oracle.com/technetwork/database/berkeleydb/overview/index.html
首先,看上面的第2、3、4和4号。 谷歌。
其次,请参阅此 stackoverflow 线程 具有非常相似的问题。
第三,如果您使用 MSSQL 作为示例,您可以在此处阅读一些内容 并按此处所述可视化页面(就像缓存行拆分一样,尽量减少这样的分裂)。例如,MSSQL 对可索引的数据施加了大小限制,即 8k = 页大小。
第四,查看我必须问的问题的答案才能提供这个答案此处
或者,您可以使用十六进制编辑器查看数据库文件并查看事物是如何映射的,但这是极端的。
Firstly, see top the 2nd,3rd,4th & 5th results from google.
Secondly, see this stackoverflow thread with very similar question.
Thirdly, if you use MSSQL as an example you can read some stuff here and visualize the pages as described here (just like cache line split it's important to minimize such splits). Also MSSQL for example imposes a size limit of the data that can be indexed which is 8k = to the page size.
Fourthly, see the answer of my question which I had to ask just be able to provide this answer here
Alternatively you can use a hex editor to view the database files and see how things are mapped but that's extreme.