如何在 B 树上使用隐式 OR 查询?
我想使用b树作为索引,但我想不出OR查询的解决方案。
对于 OR 查询,我的意思是 select * from table where id 介于 1 和 5 之间或 id 介于 10 和 15 之间;
如果我使用 id 作为 b 树中的键,那么如何在 b 树上进行像上面这样的查询?
当搜索b树时,假设小于6和大于6的key在不同的子树上,而不是当搜索路径经过包含小于6的key的子树时,id 1到5之间的id可以检索,但是10到15之间的id呢?
我是否必须使用b+树,当我找到指向 id 1 的键时,我只需一一扫描叶节点,直到找到指向 id 15 的键? 对于这种查询来说,这是一个不好的解决方案吗: 从表中选择 *,其中 id 介于 1 和 5 之间或 id 介于 10000000 和 10000005 之间???
或者还有其他解决方案吗?
非常感谢!
I want to use b-tree for index, but I can't think out an solution for OR query.
For OR query, I mean something like
select * from table where id between 1 and 5 OR id between 10 and 15;
if I use id as the key in the b-tree, than how can I do query like above on the b-tree?
when search through the b-tree, assume that the key that are smaller than 6 and bigger than 6 is on different sub-trees, than when the search path go through the sub-tree that contaions the key that are smaller than 6, id that between 1 and 5 can be retrived, but what about id that between 10 and 15?
Do I have to use the b+tree,and when I found the key which points to id 1 , I just keep scan through the leaf nodes one by one until I found the key which points to id 15?
Is it bad solution for this kind of query:
select * from table where id between 1 and 5 OR id between 10000000 and 10000005???
Or is there any other solutions?
Thank you very much!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
OR 运算意味着需要进行两次搜索,并将结果合并。
An OR operation implies that two searches need to be done, and the results combined.
OR 关键字是一个常见问题。从索引的角度来看,通常最好进行两次查找(例如,像 UNION 一样)。
然而,也有例外。您的第一个示例(id 介于 1 和 5 之间或 id 介于 10 和 15 之间)可能最好在从 1 到 15 的一次索引查找中完成,丢弃值 6-9。不过,这取决于数据量!您的第二个示例(1 到 5 之间或 id 10000000 到 10000005 之间)看起来不是该方法的良好候选者。但是,它取决于行数,而不取决于 id 的数量。
关于AND:你的例子是一个矛盾(id在1和5之间AND id在10000000和10000005之间),查询将不会返回任何行。一些优化器能够“看到”这一点。
不同列上的AND条件需要通过串联索引来解决。
请参阅我的网络书 使用索引,Luke! 了解更多详细信息。
The OR keyword is a common problem. From index perspective it is usually best to do two lookups (e.g., like a UNION).
However, exceptions exist. Your first example (id between 1 and 5 OR id between 10 and 15) might be best done in one index lookup from 1 to 15, discarding the values 6-9. However, that depends on the data volume! You second example (between 1 and 5 OR id between 10000000 and 10000005) doesn't look to be a good candidate for that approach. However, it depends on the number of rows, not on the number of the id's.
Regarding AND: your example is a contradiction (id between 1 and 5 AND id between 10000000 and 10000005), the query will not return any rows. Some optimizers are able to "see" that.
AND conditions on different columns are to be solved with concatenated indexes.
Have a look at my Web-Book Use The Index, Luke! for further details.