二分查找的最优性

发布于 2024-10-10 01:27:28 字数 203 浏览 6 评论 0原文

这可能是一个愚蠢的问题,但是有人知道二分搜索是渐近最优的证明吗?也就是说,如果给我们一个排序的元素列表,其中对这些对象唯一允许的操作是比较,那么如何证明搜索不能在 o(lg n) 中完成? (顺便说一句,这是 lg n 的小-o。)请注意,我将其限制为唯一允许的操作是比较的元素,因为有一些众所周知的算法可以在预期上击败 O(lg n)如果允许您对数据执行更复杂的操作(例如,参见插值搜索)。

This may be a silly question, but does anyone know of a proof that binary search is asymptotically optimal? That is, if we are given a sorted list of elements where the only permitted operation on those objects is a comparison, how do you prove that the search can't be done in o(lg n)? (That's little-o of lg n, by the way.) Note that I'm restricting this to elements where the only operation permitted operation is a comparison, since there are well-known algorithms that can beat O(lg n) on expectation if you're allowed to do more complex operations on the data (see, for example, interpolation search).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

酒浓于脸红 2024-10-17 01:27:28

来自此处

  • 可能结果的数量至少应为在)。
  • 您可以通过“决策树”的节点来表示算法所做的决策:如果项目比较大,则继续前进,如果不是,则继续前进。树的节点是算法的状态,叶子是结果。所以树中至少应该有 O(n) 个节点。
  • O(n) 个节点上的每棵树至少有 O(log N) 个级别。

From here:

  • The number of possible outcomes should be at least O(n).
  • You can represent the decisions done by the algorithm by nodes of a "decision tree": if the items compare greater then it goes on way, if not it goes the other way. The nodes of the tree are the states of the algorithm and the leaves are the outcomes. So there should be at least O(n) nodes in the tree.
  • Each tree on O(n) nodes has at least O(log N) levels.
北方的巷 2024-10-17 01:27:28

逻辑很简单。假设我们有 n 个不同排序元素的数组。

  1. 1 次比较有 2 种可能的结果(第一个元素较小或较大)。因此,如果一次比较就足够了,n <= 2。否则,如果我们有 3 个元素 (a, b, c),并且我们的算法有 2 个可能的结果,则永远不会选择 3 个元素之一。
  2. 2 次比较有 4 种可能的结果。因此,如果 2 次比较就足够了,n <= 4
  3. 同样,为了使 k 比较足够,n 应该是 n <= 2^k

将最后一个不等式反转即可得到对数:k >= log(2, n)

The logic is simple. Let's assume we have array of n different sorted elements.

  1. There're 2 possible outcomes of 1 comparison (first element is smaller or greater). Thus, if one comparison is enough, n <= 2. Otherwise, if we have 3 elements (a, b, c) and our algorithm has 2 possible outcomes, one of 3 elements will never be selected.
  2. There're 4 possible outcomes of 2 comparisons. Thus, if 2 comparisons is enough, n <= 4.
  3. Likewise, for k comparisons to be enough n should be n <= 2^k.

Reverse the last inequality and you'll have logarithm: k >= log(2, n).

夜未央樱花落 2024-10-17 01:27:28

正如 Nikita 所描述的,不可能有比 O(log n) 更好的东西总是

即使您允许执行一些额外的操作,但这仍然不够 - 我确信可以准备元素序列,其中插值搜索的性能比二分搜索更差。

我们可以说插值搜索更好,只是因为:

  1. 我们考虑平均性能,而不是最坏情况。
  2. 每个可能的传入数据集的概率是不均匀的。

所以答案是——这完全取决于我们对传入数据集的额外了解。

As Nikita described it's impossible to have something always better than O(log n).

Even if you allowed to do some additional operations, it's still not enough - I'm sure it's possible to prepare elements sequence where interpolation search will perform worse than binary search.

We can say interpolation search is better only because:

  1. We consider average performance, not worst case.
  2. Probability of each possible incoming data set is non-uniform.

So the answer is - it all depends on the additional knowledge we have about incoming data sets.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文