MySQL 复合索引和运算符 BETWEEN

发布于 2024-10-06 02:00:39 字数 205 浏览 17 评论 0原文

我对此查询有一个问题：

SELECT * 
  FROM runs 
 WHERE (NOW() BETWEEN began_at 
                  AND finished_at)

您认为为 begin_at 和 finish_at 列创建复合索引有意义吗？或者只为 begin_at 创建索引才有意义？

原文

I have a question about this query:

SELECT * 
  FROM runs 
 WHERE (NOW() BETWEEN began_at 
                  AND finished_at)

Do you think it makes sense to create composite index for began_at and finished_at columns?
Or it makes sense to create index only for began_at?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

百合的盛世恋 2024-10-13 02:00:39

你的风格很不寻常。

大多数人可能会写 WHERE begin_at < NOW() 和完成时间 > NOW()

但是。我建议在这两个字段上都建立索引。

组合键对您没有用，因为它只会加快特定日期组合的搜索速度。

嗯，这并不完全正确，因为如果您使用 betree，组合键会对您有所帮助，但不如单独索引它们那么好。
如果您使用等号 (=) 运算符搜索字段组合，则组合键非常有用。单字段索引在 ragen 请求中表现更好。

你可以用谷歌搜索一下“多维范围搜索”。

原因是在btree中基本上可以在log(n)时间内找到一个字段中的所有匹配字段。
所以你的总运行时间将是 O(k*log(n))，即 O(log(n))。

多维范围查询的运行时间为 O(sqrt(n))，更高。然而，还有更好的实现，也可以实现对数运行时间。
然而它们并没有在 mysql 中完全实现，所以根据版本的不同，它们会更糟或更糟糕。

让我总结一下：

单个字段上的相等比较：哈希索引（运行时 O(1)）
单个字段上的范围搜索fields：单个字段的 btree 索引 ( O(log(n)) )
多个字段的相等搜索：组合哈希键（运行时 O(1)）

这些情况是很清楚的...

多个字段上的范围搜索：单独的 btree 索引（ O( log(n)) ）

这是不太清楚的地方。由于上述原因，在当前版本中，单独索引显然更好。
通过该用例的完美实现，您可以使用组合键获得更好的性能，但没有系统知道哪个支持它。
mysql 从 5.0 版本开始支持松散索引（您需要它），但非常有限，并且查询优化器仅在极少数情况下使用它们。不知道像5.3之类的新版本。

然而，随着 mysql 实现松散索引，在进行范围请求或按不同方向排序的字段上组合键变得越来越相关。

Your style is very uncommon.

Most people would probably write WHERE began_at < NOW() AND finished_at > NOW()

However. I would recommend putting an index on both fields.

A combined key wont be of use to you because you it would only speed up searcher for specific date combinations.

Well this is not entirely true because if you use betree a combined key will help you but not as good as if you index them seperately.
Combined keys are very good if you search combinations of fields with equality (=) operator. SIngle field indexes perform better in ragen requests.

You can google a bit for "multidimensional range search".

The reason is that all matching fields in one field can be basically found in log(n) time in btrees.
So your overall runtime will be O(k*log(n)) which is O(log(n)).

Multidimensional Range queries have a runtime of O(sqrt(n)) which is higher. However there are better implementations as well which also acheav logarithmic runtime.
However they are not fully implemented in mysql, so it will be worse or awful depending on the version.

So let me sum up:

Equality comparisions on single fields: hash index (runtime O(1))
Range search on single fields: btree index on single fields ( O(log(n)) )
Equality search on multiple fields: combined hash key (runtime O(1))

those cases are a clear thing...

Range search on multiple fields: seperate btree indexes ( O(log(n)) )

this is where its not so clear. with current versions its clearly better to index seperately because of the reasons given above.
With a perfect implementation for that use case you could achieve better performance with combined keys but there is no system in know of which supports it.
mysql supports loose indexes (which you need for that) since version 5.0, but only very limited and the query optimizer only utilizes them in rare cases afaik. don't know about newer versions like 5.3 or something.

however with mysql implementing loose indexes combined keys on fields where you do range requests or sorting in different directions become more and more relevant.

回复收藏 0 原文

池予 2024-10-13 02:00:39

由于使用了不等式，而不是等式，复合索引不会比两个单独的索引做得更好（如果不是更差）。

我主张在 began_at 和 finished_at 上倾向于两个单独的索引。

松散索引扫描参考：

http:// www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/

http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html

回复收藏 0 原文

放赐 2024-10-13 02:00:39

“索引合并”策略可以从 MySQL 5 开始发挥作用： http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html - 这也表明单独的索引可能会更好。

然而，我从来没能让它为我工作:)

回复收藏 0 原文

情深已缘浅 2024-10-13 02:00:39

此类问题的正确答案始终是：“视情况而定”。尝试两种方法并进行基准测试并比较执行计划。请注意，随着表中数据量的变化和查询工作负载的变化，答案也会发生变化。不断发展的系统上的索引几乎永远不会一劳永逸。

回复收藏 0 原文

囚我心虐我身 2024-10-13 02:00:39

好问题，但我实际上会从 began_at 和 finished_at 上的索引开始，因为您可以重写此查询的一种合理方式如下

SELECT *
  FROM runs
 WHERE began_at < NOW()
   AND finished_at > NOW()

这使得（对我来说）更清楚的是，每一列都需要自己的索引。

Good question, but I'd actually start with indexes on both began_at and finished_at, because one reasonable way you might rewrite this query is like so

SELECT *
  FROM runs
 WHERE began_at < NOW()
   AND finished_at > NOW()

Which makes it a bit more clear (to me) that each column needs its own index.

回复收藏 0 原文

沙沙粒小 2024-10-13 02:00:39

在尝试优化此类查询时，我从未让 MySQL 对 begin_at 或 finished_at 使用两个单独的索引。显然，其他人说复合索引也不起作用，因此可能无法让 MySQL 使用索引来优化此查询。

回复收藏 0 原文

~没有更多了~

关于作者

可可

暂无简介

文章

26 人气

关注发私信

友情链接

文江博客

MySQL 复合索引和运算符 BETWEEN

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（6）

关于作者

相关话题

热门标签

推荐作者

qq_VRzBBA45

痴情

。

Mu.

凉薄对峙

不落城

友情链接

MySQL 复合索引和运算符 BETWEEN

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（6）

关于作者

相关话题

热门标签

推荐作者

qq_VRzBBA45

痴情

。

Mu.

凉薄对峙

不落城

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。