B-trees vs binary trees
If I am implementing an in-memory (RAM) search operation with B-trees, would it be better in terms of caching or some other effects when compared with binary trees?

What I know is:

binary search trees --- O(log n)
B-trees --------------- O(c log n)

There was a lot of discussion regarding this on various blogs.
Comments (2)
Algorithmic complexity is the same, since O(log_b n) = O(c log n) = O(log n), but the constant factors hidden in big-O notation can vary noticeably, depending on implementation and hardware.
B-trees were designed for platter hard disks, which have a large access time (moving the head into position) after which an entire physical sector is read. Making the B-tree nodes as large as the sector minimizes the number of access times and maximizes the useful data out of each read operation.
But if you are working out of memory you have a negligible access time, therefore a better comparison is to count the number of single words accessed by your algorithm.
For example, let's plan a data structure to store 2^20 keys of 1 word each, for a total of 4 MiB of raw data on a 32-bit machine.
A binary search tree will have 2^20 nodes, each holding one key and two pointers (3 words). Depth will be log_2(2^20) = 20. The average search will have to read the key and one of the pointers from each node in its path, from the root all the way down = 40 words.
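This back-of-envelope figure can be reproduced in a few lines, using exactly the cost model stated above (one key and one pointer read per level):

```python
import math

# Model from the text: 2**20 one-word keys; each BST node holds one key
# and two pointers, and a search reads the key plus one pointer per level.
n_keys = 2 ** 20
depth = math.log2(n_keys)   # expected path length: 20 levels
words_read = 2 * depth      # key + one pointer at each level
print(depth, words_read)    # 20.0 40.0
```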
A B-tree made for hard disks will have 4 kB nodes. Each node could be stored internally as a sorted array of key and pointer couples, between 256 and 512 of them. What is the average search going to look like? Considering an average 3/4 fill, each node will contain 384 entries, and its internal binary search will have to visit on average log_2(384) = 8.58 keys. The average depth will be log_384(2^20) = 2.33, so our search will have to read on average 2.33 times 8.58 keys, or about 20 words.
In the case of a low-fanout (branching factor) B-tree, with each node holding between 16 and 32 keys, the average fill will be 24 keys, the average depth log_24(2^20) = 4.36, the binary search in each node will make log_2(24) = 4.58 comparisons, and the overall average search will have to read about 20 words.
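The same cost model can be written down directly. One detail worth noticing: because each level performs a binary search over its keys, depth × per-node probes collapses algebraically to log_2 of the key count, so the total number of key words read comes out roughly the same for any fanout; the practical difference between fanouts comes from where those words sit in memory, as discussed below:

```python
import math

n_keys = 2 ** 20  # same data set as in the example

def avg_search_words(avg_fill):
    """Average key words read by one search under the model above:
    (levels visited) * (binary-search probes inside each node)."""
    depth = math.log(n_keys, avg_fill)   # log_fill(2**20) levels
    probes = math.log2(avg_fill)         # probes per in-node binary search
    return depth * probes

# Disk-sized nodes (3/4 of 512 = 384 keys) vs low fanout (3/4 of 32 = 24).
print(avg_search_words(384), avg_search_words(24))  # both come out ~20
```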
Keep in mind that the last two data structures achieve a better result than binary trees because they optimize read operations over modifications. To insert a key into one of these B-trees you would have to rewrite on average an entire 384-word or 24-word node, if not more than one, while in the binary-tree case a write operation would still only need to touch up to 40 words.
(Previously I had it wrong. Thanks to @virco and @Groo for pointing out my mistake in the comments.)
In any case, memory-only B-trees with a low fanout do appear to perform better than binary trees in practice.
32 keys per node in particular seems to be a sweet spot for current architectures, both 32- and 64-bit. Many newer languages and libraries use 32-key B-trees as a built-in data structure, alongside or as a replacement for hash tables and arrays. This usage was spearheaded by Clojure and other functional languages, and was subsequently taken up by more mainstream languages such as Javascript, with the recent focus on immutable data structures (e.g. Immutable.js).
This result may be explained not only by counting the number of words read from memory, but also by counting cache misses: read operations that cause the CPU to stall and wait for the RAM. If the caching architecture can fetch chunks of RAM that contain an entire B-tree node at a time, we get the same optimization that has been used successfully for disk-based mass storage.
In the case of hard-disk optimized data structures, we would use B-trees with nodes as large as the physical disk sector, to minimize disk access times. In this case, we are using B-trees with nodes as large as the read operation that is performed by the Level 3 cache against the RAM, to minimize cache misses.
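As a rough illustration of that sizing idea (the 64-byte line size and 4-byte word below are assumptions for a typical 32-bit setup, not measurements; real L3 fetch granularity varies by CPU):

```python
# Rough sizing sketch: how many key/pointer pairs fit in one cache
# transfer. A 64-byte cache line and 4-byte machine word are assumed.
line_bytes = 64
word_bytes = 4                                     # 32-bit machine, as above
pairs_per_line = line_bytes // (2 * word_bytes)    # one key + one pointer
print(pairs_per_line)                              # 8 pairs per 64 B line
# A 32-key node thus spans only a few consecutive lines, a pattern
# that hardware prefetchers handle well.
```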
B-trees differ from binary trees in that keys and pointers are clustered in memory, so you get somewhat better cache behavior both on disk and in memory. There is no difference in asymptotic (big-O) runtime, though.
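A minimal sketch of that clustering (not a full B-tree: no balancing or insertion), where each node keeps its keys in one contiguous sorted array and the standard-library `bisect` does the in-node search, instead of chasing one heap-scattered pointer per key as a BST does:

```python
import bisect

class BTreeNode:
    """Toy B-tree node: keys clustered in one sorted array."""
    def __init__(self, keys, children=None):
        self.keys = keys          # sorted list, contiguous in memory
        self.children = children  # None for leaf nodes

def search(node, key):
    """Walk down the tree, binary-searching inside each node."""
    while node is not None:
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        node = node.children[i] if node.children else None
    return False

leaf1 = BTreeNode([1, 3, 5])
leaf2 = BTreeNode([9, 11, 13])
root = BTreeNode([7], [leaf1, leaf2])
print(search(root, 11), search(root, 4))  # True False
```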