Binary search vs. binary search tree

Posted 2024-11-06 18:28:10

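What is the benefit of a binary search tree over a sorted array with binary search? From mathematical analysis alone I do not see a difference, so I assume the difference must lie in low-level implementation overhead. My analysis of the average-case running time is below.

Sorted array with binary search
Search: O(log(n))
Insertion: O(log(n)) (we run binary search to find where to insert the element)
Deletion: O(log(n)) (we run binary search to find the element to delete)

Binary search tree
Search: O(log(n))
Insertion: O(log(n))
Deletion: O(log(n))

Binary search trees have a worst case of O(n) for the operations listed above (if the tree is not balanced), so this looks like it would actually be worse than a sorted array with binary search.

Also, I am not assuming that we have to sort the array beforehand (which would cost O(n log(n))). We would insert elements one by one into the array, just as we would for the binary tree. The only benefit of a BST I can see is that it supports other kinds of traversal, such as inorder, preorder, and postorder.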

满栀 2024-11-13 18:28:10

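Your analysis is wrong: both insertion and deletion are O(n) for a sorted array, because you have to physically move the data to make room for the inserted element or to close the gap left by the deleted one.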
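
To make the shifting cost concrete, here is a minimal sketch assuming the sorted array is a `std::vector<int>`. The helper names `sorted_insert` and `sorted_erase` are made up for this illustration. The search is O(log n), but `insert` and `erase` still move every element after the affected position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: inserts x into a sorted vector and returns how many
// existing elements had to move one slot to the right to make room.
std::size_t sorted_insert(std::vector<int>& v, int x) {
    assert(std::is_sorted(v.begin(), v.end()));
    auto pos = std::lower_bound(v.begin(), v.end(), x);  // O(log n) comparisons
    auto shifted = static_cast<std::size_t>(v.end() - pos);
    v.insert(pos, x);                                     // O(n) element moves
    return shifted;
}

// Hypothetical helper: removes one occurrence of x if present and reports
// whether anything was removed.
bool sorted_erase(std::vector<int>& v, int x) {
    assert(std::is_sorted(v.begin(), v.end()));
    auto pos = std::lower_bound(v.begin(), v.end(), x);  // O(log n) comparisons
    if (pos == v.end() || *pos != x) return false;
    v.erase(pos);  // every later element moves one slot left: O(n)
    return true;
}
```

Inserting a new smallest element is the worst case, since the whole array has to shift.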

Oh, and the worst case for a completely unbalanced binary search tree is O(n), not O(log n).
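
A small sketch of how that worst case arises, assuming a plain BST with no rebalancing. `Node`, `bst_insert`, and `bst_height` are names invented for this example. Feeding it keys in sorted order turns the tree into a linked list, so its height equals the number of keys.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

// Illustrative node type; not taken from any library.
struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Plain (non-balancing) BST insert; duplicate keys are ignored.
void bst_insert(std::unique_ptr<Node>& root, int key) {
    std::unique_ptr<Node>* cur = &root;
    while (*cur) {
        if (key < (*cur)->key)      cur = &(*cur)->left;
        else if ((*cur)->key < key) cur = &(*cur)->right;
        else return;                // key already present
    }
    assert(!*cur);                  // the loop only exits on an empty link
    *cur = std::make_unique<Node>(key);
}

// Number of nodes on the longest root-to-leaf path.
std::size_t bst_height(const std::unique_ptr<Node>& n) {
    return n ? 1 + std::max(bst_height(n->left), bst_height(n->right)) : 0;
}
```

Inserting 1, 2, ..., 15 in that order gives a height of 15, while inserting the same keys level by level (8, 4, 12, ...) gives a height of 4.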

琴流音 2024-11-13 18:28:10

There is not much of a benefit to either one when it comes to querying.

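But constructing a sorted tree is a lot faster than constructing a sorted array when you are adding elements one at a time. So there is no point in converting it to an array once you are done.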
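
A rough sketch of the two ways of building, under some assumptions: `std::multiset` stands in for the sorted tree (the standard guarantees logarithmic insertion), and `build_sorted_vector` and `build_tree` are hypothetical helper names. The vector version counts how many element shifts the one-at-a-time build incurs. The tree version needs no shifts at all.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Builds a sorted vector by inserting one element at a time and returns the
// total number of element shifts performed along the way.
std::size_t build_sorted_vector(const std::vector<int>& input, std::vector<int>& out) {
    std::size_t shifts = 0;
    out.clear();
    for (int x : input) {
        auto pos = std::upper_bound(out.begin(), out.end(), x);
        shifts += static_cast<std::size_t>(out.end() - pos);  // elements moved right
        out.insert(pos, x);
    }
    assert(std::is_sorted(out.begin(), out.end()));
    return shifts;
}

// Builds the tree one element at a time: O(log n) per insert, no shifting.
std::multiset<int> build_tree(const std::vector<int>& input) {
    std::multiset<int> tree;
    for (int x : input) tree.insert(x);
    return tree;
}
```

For n elements arriving in descending order the vector build performs n(n-1)/2 shifts in total, for example 10 shifts for five elements. Actual wall-clock times depend on n and the hardware, so treat the count as an illustration of the asymptotics only.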

ι不睡觉的鱼゛ 2024-11-13 18:28:10

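Note also that there are standard algorithms for maintaining balanced binary search trees. They get rid of the deficiencies of binary trees and keep all of the other strengths. They are complicated, though, so you should learn about plain binary trees first.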

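Beyond that, the big-O may be the same, but the constants are not always. With binary trees, if you store the data correctly, you can make very good use of caching at multiple levels. The result is that if you are doing a lot of querying, most of the work stays inside the CPU cache, which greatly speeds things up. This is particularly true if you are careful about how you structure your tree. See http://blogs.msdn.com/b/devdev/archive/2007/06/12/cache-oblivious-data-structs.aspx for an example of how a clever layout of the tree can improve performance greatly. An array that you binary search in does not permit any such tricks.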
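
As one illustration of what a layout-aware tree can look like, here is a sketch of a breadth-first layout (often called an Eytzinger layout), where the search tree is stored level by level in one contiguous buffer and the children of slot k sit at slots 2k and 2k+1. This is an assumption picked for the example and not necessarily the layout the linked article describes. `to_bfs_layout` and `bfs_contains` are made-up names. The idea is that the top levels of the tree, which every query touches, sit next to each other in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// In-order walk over the implicit tree, filling slots with ascending keys.
void fill_in_order(const std::vector<int>& sorted, std::vector<int>& tree,
                   std::size_t& next, std::size_t k) {
    if (k >= tree.size()) return;
    fill_in_order(sorted, tree, next, 2 * k);      // left subtree
    tree[k] = sorted[next++];
    fill_in_order(sorted, tree, next, 2 * k + 1);  // right subtree
}

// Converts sorted keys to a 1-indexed breadth-first layout (slot 0 is unused).
std::vector<int> to_bfs_layout(const std::vector<int>& sorted) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    std::vector<int> tree(sorted.size() + 1);
    std::size_t next = 0;
    fill_in_order(sorted, tree, next, 1);
    assert(next == sorted.size());
    return tree;
}

// Descends from the root at slot 1: go to 2k for smaller keys, 2k+1 for larger.
bool bfs_contains(const std::vector<int>& tree, int x) {
    std::size_t k = 1;
    while (k < tree.size()) {
        if (tree[k] == x) return true;
        k = 2 * k + (tree[k] < x ? 1 : 0);
    }
    return false;
}
```

For the keys 1 through 7 the buffer becomes `{0, 4, 2, 6, 1, 3, 5, 7}`, with the root 4 in slot 1. Whether this beats an ordinary binary search depends on the data size and the machine, so measure before relying on it.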

乙白 2024-11-13 18:28:10

Adding to @Blindy's answer: I would say insertion into a sorted array costs more in memory operations, O(n) via std::rotate(), than in CPU instructions, O(log n) for the search. Compare insertion sort.

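    #include <algorithm>
    #include <vector>

    std::vector<MYINTTYPE> sorted_array;

    // ... ...

    // insert x at the end
    sorted_array.push_back(x);

    // everything before the newly appended element is still sorted
    auto last = sorted_array.end() - 1;

    // O(log n) CPU operation: find where x belongs in the sorted prefix
    auto insertion_point = std::lower_bound(sorted_array.begin(), last, x);

    // O(n) memory operation: rotate x from the back into its place
    std::rotate(insertion_point, last, sorted_array.end());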

I guess a left-child right-sibling tree combines the essence of a binary tree and a sorted array.

| Data structure | Operation | CPU cost | Memory operation cost |
|---|---|---|---|
| sorted array | insert | O(log n) (benefits from pipelining) | O(n) memory operations; compare insertion sort using `std::rotate()` |
| | search | O(log n) | benefits from an inline implementation |
| | delete | O(log n) (when pipelined with the memory operations) | O(n) memory operations; see `std::vector::erase()` |
| balanced binary tree | insert | O(log n) (branch misprediction hurts pipelining; tree rotations add cost) | additional cost of pointers that exhaust the cache |
| | search | O(log n) | |
| | delete | O(log n) (same as insert) | |
| left-child right-sibling tree (combines sorted array and binary tree) | insert | O(log n) on average | no `std::rotate()` needed when inserting on the left child, if kept unbalanced |
| | search | O(log n) (worst case O(n) when unbalanced) | takes advantage of cache locality in the right-sibling search; see `std::lower_bound()` |
| | delete | O(log n) (when hyperthreading/pipelining) | O(n) memory operations; see `std::vector::erase()` |