Is there a performance difference between BETWEEN and IN in MySQL, or in SQL in general?

Posted on 2024-09-11 02:59:25 · 224 words · 1 view · 0 comments

I have a set of consecutive rows I want to get based upon their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance difference between:

SELECT * FROM `theTable` WHERE `id` IN (n, ... nk); 

and:

SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;


Comments (4)

逆光下的微笑 2024-09-18 02:59:25

BETWEEN should outperform IN in this case (but do measure and check the execution plans, too!), especially as n grows and as long as the statistics are still accurate. Let's assume:

  • m is the size of your table
  • n is the size of your range

Index can be used (n is tiny compared to m)

  • In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, followed by a traversal of at most n index leaf nodes. The complexity will be O(n + log m)

  • IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will always be O(n * log m) ... which is always worse (though the difference is negligible for very small tables m or very small ranges n)

Index cannot be used (n is a significant portion of m)

In any case, you'll get a full table scan and evaluate the predicate on each row:

  • BETWEEN needs to evaluate two predicates: one for the lower bound and one for the upper bound. The complexity is O(m)

  • IN needs to evaluate at most n predicates. The complexity is O(m * n) ... which is again always worse, or perhaps O(m) if the database can optimise the IN list to be a hashmap, rather than a list of predicates.
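
The indexed case above can be illustrated with a toy model. This is only a sketch under loud assumptions: the index is modelled as a sorted Python list rather than a real B-tree, and the helper names (`lower_bound`, `between_probes`, `in_probes`) are hypothetical. It counts how many keys each strategy touches; it says nothing about what MySQL or any other engine actually does.

```python
def lower_bound(keys, target):
    """Return (position, probes) for the first key >= target in a sorted list."""
    lo, hi, probes = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes


def between_probes(keys, low, high):
    """One descent to `low`, then a sequential walk while keys stay <= `high`."""
    pos, probes = lower_bound(keys, low)
    rows = []
    while pos < len(keys) and keys[pos] <= high:
        rows.append(keys[pos])
        pos += 1
        probes += 1
    return rows, probes


def in_probes(keys, values):
    """One independent descent per listed value."""
    rows, probes = [], 0
    for v in values:
        pos, p = lower_bound(keys, v)
        probes += p
        if pos < len(keys) and keys[pos] == v:
            rows.append(keys[pos])
    return rows, probes


keys = list(range(1, 100_001))   # m = 100,000 ids with no gaps (assumed)
low, high = 5_000, 5_099         # n = 100 consecutive ids (assumed)

between_rows, between_cost = between_probes(keys, low, high)
in_rows, in_cost = in_probes(keys, range(low, high + 1))

# Same rows either way; the probe counts follow O(n + log m) vs O(n * log m).
print(between_rows == in_rows, between_cost, in_cost)
```

In this model both strategies return the same rows, and the gap in probe counts widens as n grows, which is the shape of the argument above. Real engines add caching, batching and optimizer rewrites, so measuring on the actual database remains the deciding step.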

怼怹恏 2024-09-18 02:59:25

a between b and c is a macro that expands to b <= a and a <= c.

a in (b,c,d) is a macro that expands to a=b or a=c or a=d.

Assuming your n and nk are integers, both should end up meaning the same. The between variant should be much faster because it's only two compares, versus up to nk - n + 1 compares (one per listed value) for the in variant.
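
The two expansions can be written out as plain predicates. This is a minimal sketch with hypothetical function names (`between`, `in_list`); it ignores SQL NULL semantics and only shows that, for integers, the two forms agree while doing different amounts of work.

```python
def between(a, b, c):
    """a BETWEEN b AND c  ->  b <= a AND a <= c (two comparisons)."""
    return b <= a and a <= c


def in_list(a, values):
    """a IN (v1, v2, ...)  ->  a = v1 OR a = v2 OR ... (up to len(values) comparisons)."""
    return any(a == v for v in values)


n, nk = 10, 20                       # example bounds, chosen for illustration
listed = list(range(n, nk + 1))      # the values an IN list would enumerate

# For integer inputs, both predicates accept exactly the same values.
agree = all(between(a, n, nk) == in_list(a, listed) for a in range(0, 31))
print(agree, len(listed))
```

Here the IN form enumerates nk - n + 1 values, while the BETWEEN form always needs just the two bound checks.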

少女净妖师 2024-09-18 02:59:25

I have researched this question.
I have 11M rows in my table, and I executed two queries on it:

Query 1: SELECT * FROM PLAYERS WHERE SCORE BETWEEN 10 AND 20

Query 2: SELECT * FROM PLAYERS WHERE SCORE IN (10,11,...,20)

At execution time, both queries are translated as Andomar said above.

Of the two queries, Query 1 runs faster than Query 2.

To learn more, follow this link:

Performance of BETWEEN VS IN() in MySQL

Thank you.
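
A comparison like the one in this answer can be reproduced on a small scale. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so it is not MySQL and its query plans will differ. The table and column names follow the answer, while the row count, the data distribution and the index name `idx_players_score` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO players (score) VALUES (?)",
    [(i % 100,) for i in range(10_000)],   # scores 0..99, 100 rows each (assumed data)
)
conn.execute("CREATE INDEX idx_players_score ON players (score)")

between_sql = "SELECT id FROM players WHERE score BETWEEN 10 AND 20 ORDER BY id"
in_values = ", ".join(str(v) for v in range(10, 21))
in_sql = f"SELECT id FROM players WHERE score IN ({in_values}) ORDER BY id"

between_ids = [row[0] for row in conn.execute(between_sql)]
in_ids = [row[0] for row in conn.execute(in_sql)]
print(between_ids == in_ids, len(between_ids))

# Inspect how this engine plans each query; the output is engine-specific.
for sql in (between_sql, in_sql):
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(row)
```

On MySQL the equivalent check would be to run `EXPLAIN` on both statements and time them against the real table, since the results here only show that the two queries select the same rows.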

相思碎 2024-09-18 02:59:25

In many database servers, IN() is just a synonym for multiple OR clauses, because the two are logically equivalent. Not so in MySQL, which sorts the values in the IN() list and uses a fast binary search to see whether a value is in the list. This is O(log n) in the size of the list, whereas an equivalent series of OR clauses is O(n) in the size of the list (i.e., much slower for large lists).
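
The difference between the two lookup strategies can be sketched as follows. This only illustrates binary search against a linear OR chain, with hypothetical function names (`in_sorted`, `in_or_chain`); it does not inspect or verify MySQL's internals.

```python
from bisect import bisect_left


def in_sorted(value, sorted_values):
    """Membership test by binary search: O(log n) comparisons per lookup."""
    i = bisect_left(sorted_values, value)
    return i < len(sorted_values) and sorted_values[i] == value


def in_or_chain(value, values):
    """Membership test as value = v1 OR value = v2 OR ...: O(n) comparisons."""
    for v in values:
        if value == v:
            return True
    return False


in_list_values = [42, 7, 19, 3, 88]      # example IN() list, chosen for illustration
sorted_values = sorted(in_list_values)   # sorted once, before any row is checked

results = [
    (candidate, in_sorted(candidate, sorted_values), in_or_chain(candidate, in_list_values))
    for candidate in (3, 19, 50)
]
print(results)
```

The sort is paid once per query, so the binary search only pays off when the list is checked against many rows or is itself long.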
