AVL 树与 B 树
AVL 树与 B 树有何不同?
How is an AVL tree different from a B-tree?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
AVL 树与 B 树有何不同?
How is an AVL tree different from a B-tree?
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
接受
或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
发布评论
评论(10)
AVL 树旨在用于内存中使用,其中随机访问相对便宜。 B 树更适合磁盘支持的存储,因为它们将大量的键分组到每个节点中,以最大限度地减少读取或写入操作所需的查找次数。 (这就是为什么 B 树经常用于文件系统和数据库,例如 SQLite。)
AVL trees are intended for in-memory use, where random access is relatively cheap. B-trees are better suited for disk-backed storage, because they group a larger number of keys into each node to minimize the number of seeks required by a read or write operation. (This is why B-trees are often used in file systems and databases, such as SQLite.)
AVL 树和 B 树的相似之处在于,它们都是数据结构,通过它们的要求,导致各自树的高度最小化。这种“短度”允许在 O(log n) 时间内执行搜索,因为最大可能的读取次数对应于树的高度。
这是一棵AVL树,其核心是一棵二叉搜索树。然而,它是自平衡的,这意味着当您向树添加元素时,它会自行重组以尽可能保持高度的均匀性。基本上,它不会允许长分支。
B 树也可以做到这一点,但通过不同的重新平衡方案。写起来有点太复杂了,但是如果你在 Google 上搜索“B 树动画”,就会发现一些非常好的小程序可以很好地解释 B 树。
它们的不同之处在于,AVL 树是基于基于内存的解决方案来实现的,而 B 树是基于基于磁盘的解决方案来实现的。 AVL 树并不是为了保存大量数据而设计的,因为它们使用动态内存分配和指向下一个内存块的指针。显然,我们可以使用磁盘位置和磁盘指针来复制 AVL 树的功能,但速度会慢得多,因为我们仍然需要大量的读取来读取非常大的树。
当数据集合太大而无法放入内存时,解决方案是 B 树(有趣的事实:对于“B”实际代表什么还没有达成共识)。一棵 B 树在一个节点上保存有许多子节点,并且有许多指向子节点的指针。这样,在磁盘读取期间(读取单个磁盘块可能需要大约 10 毫秒),将返回最大量的相关节点数据以及指向“叶节点”磁盘块的指针。这使得数据的检索时间可以摊销到 log(n) 时间,使得 B 树对于数据库和大型数据集检索实现特别有用。
Both the AVL tree and the B-tree are similar in that they are data structures that, through their requirements, cause the height of their respective trees to be minimized. This "shortness" allows searching to be performed in O(log n) time, because the largest possible number of reads corresponds to the height of the tree.
This is an AVL tree, and is a binary search tree at its core. However, it is self-balancing, which means that as you add elements to the tree, it will restructure itself to maintain as uniform of a height as it can. Basically, it will not allow long branches.
A B-tree also does this, but through a different re-balancing scheme. It's a little too complicated to write out, but if you Google search "B-tree animation" there are some really good applets out there that explain a B-tree pretty well.
They are different in that an AVL tree is implemented with memory-based solutions in mind, while a B-tree is implemented with disk-based solutions in mind. AVL trees are not designed to hold massive collections of data, as they use dynamic memory allocation and pointers to the next block of memory. Obviously, we could replicate the AVL tree's functionality with disk locations and disk pointers, but it would be much slower because we would still have a significant number of reads to read a tree of a very large size.
When the data collection is so large that it doesn't fit in memory, the solution is a B-tree (interesting factoid: there is no consensus on what the "B" actually stands for). A B-tree holds many children at one node and many pointers to children node. This way, during a disk read (which can take around 10 ms to read a single disk block), the maximum amount of relevant node data is returned, as well as pointers to "leaf node" disk blocks. This allows retrieval time of data to be amortized to log(n) time, making the B-tree especially useful for database and large dataset retrieval implementations.
AVL 树是一种自平衡二叉搜索树,平衡以保持 O(log n) 高度。
B树是平衡树,但它不是二叉树。节点有更多的子节点,这会增加每个节点的搜索时间,但会减少搜索需要访问的节点数量。这使得它们非常适合基于磁盘的树。有关更多详细信息,请参阅维基百科文章。
An AVL tree is a self-balancing binary search tree, balanced to maintain O(log n) height.
A B-tree is a balanced tree, but it is not a binary tree. Nodes have more children, which increases per-node search time but decreases the number of nodes the search needs to visit. This makes them good for disk-based trees. For more details, see the Wikipedia article.
AVL 树是一种自平衡二叉树,它可以实现 O(lgN) 平均和最坏情况的搜索插入和删除操作。它用于内存支持的搜索树(中等大小的数据集)。
B 树主要用作非常大的数据集的存储支持的搜索树,因为它需要较少的磁盘读取(因为每个节点包含 N 个键,其中 N > 1)。 B 树被称为 (N,N+1) B 树,其中 N 是每个节点的键数,N+1 是每个节点的子节点数。每个节点的键越多,您需要从磁盘读取的次数就越少,并且它自然也会是一个更浅的树(更少的级别)。
An AVL tree is a self balancing binary tree which enable O(lgN) average and worst case for search insert and delete operations. It is used for in memory backed search trees (moderate sized datasets).
A B-Tree is primarily used as a storage backed search tree for very large datasets because it requires less reads to disk (since each node contains N keys where N >1). A B-Tree is said to be a (N,N+1) B-Tree where N is the number of keys per node and N+1 is the number of children per node. The more keys per node the less times you will need to read from disk and it will also naturally be a shallower tree (less levels).
其他回答者已经提供了有关 AVL 和 B 树的相当深入的技术细节,但我想添加有关这两者的相对新手信息:
Other answerers have already provided quite in-depth technical details about both AVL and B-Tree but I would like to add a relatively novice information regarding these two:
它们确实非常不同,尽管它们的目的大致相同:支持关联表。从历史上看,AVL 树在内存操作方面优于 B 树,但当访问内存比 CPU 周期便宜时尤其如此。
虽然 B 树通常用于数据库存储可变长度键,但 B 树对于固定长度和短记录(键 + 数据)表现最佳。对于此类用途,无论是在内存占用(因为它们存储数据更紧凑)还是速度(它们有更好的缓存局部性)方面,这可能会明显优于内存中使用的 AVL 树。
L2 是一个数据结构库,通过 B 树实现非常快速的关联表和序列。它还具有 AVL 树,可以轻松比较两者的性能。
They are very different indeed, though they serve largely the same purpose: supporting an associative table. Historically, the AVL tree were taken to outperform the B-tree for in-memory operations, but that hold especially true when accessing memory was cheap(er) when compared to CPU cycles.
Though generally used in databases store for variable length keys, the B-trees perform best for fixed length and short records (key + data). For such uses, that may significantly outperform the AVL trees for in-memory uses, both in terms of memory footprint (as they store data more compactly) and speed (they'd have lot better cache locality).
L2 is a data structures library implementing very fast associative tables and sequences over B-trees. It also has AVL trees, and making a comparison between the performance of two is easy.
B 树使用了上述所有思想。特别是 B 树:
此外,B 树通过确保内部节点至少半满来最大限度地减少浪费。 B 树可以处理任意数量的插入和删除。
The B-tree uses all of the ideas described above. In particular, a B-tree:
In addition, a B-tree minimizes waste by making sure the interior nodes are at least half full. A B-tree can handle an arbitrary number of insertions and deletions.
AVL 是自平衡的,保证所有操作在平均和最坏情况下都是 O(log n)。
AVL is self balancing, guaranteeing that all operations are O(log n) in average and worst cases.
B 树
这里的一个典型用例是磁盘上的数据库,在树中向下一级查找项目的成本很高。当你必须这样做时,你需要尽可能减少工作量。每个b树都有一个最大度,每个节点保存项目和对其他节点的引用(n个项目,n+1个引用)。这一事实使得在磁盘上存储和稍后查找项目块变得容易。它还允许您在添加/删除项目时避免重写树的大块。
AVL 树
一种自平衡平衡二叉树变体。平衡二叉树是最大高度为
log(n)
的一棵树。B-tree
A typical use-case here is with databases on disk, where it is expensive to go down a level in the tree to find an item. When you have to do it, you need to shave off as much work as possible. Each b-tree has a max degree and each node holds items and references to other nodes(n items, n+1 references). This fact makes it easy to store and later find chunks of items on disk. It also allows you to avoid re-writing large chunks of the tree when adding/ removing items.
AVL Tree
A self balancing balanced binary tree variety. A balanced binary tree is one with a max height of
log(n)
.树是编程中一种简单抽象(保护私有存储数据)的分层数据结构。它有具有父子关系的节点。根节点是没有父节点的节点,是树的起始节点。
二叉树是对树的修改,它遵循以下条件:
每个节点最多可以有 2 个子节点
二叉搜索树是二叉树的进一步修改。除了二叉树的条件外,父级和子级之间必须具有以下关系:
左子树 <= 父树 <= 右子树
AVL 树是 BST 的进一步修改。它必须具有以下性质:
Tree is a simple abstract(protects the private stored data) hierarchical data structure in programming. It has nodes with parent-child relationship. The root node is the one with no parent, and is the starting node of the tree.
A binary tree is a modification of the tree which follows the condition:
Each node can have at most 2 children
A binary search tree is a further modification of binary tree. Along with the condition of binary tree it must have the following relationship between the parent and children:
Left child <= Parent <= Right Child
An AVL tree is a further modification of BST. It must have the following properties: