MySQL 复合索引和运算符 BETWEEN
我对此查询有一个问题:
SELECT *
FROM runs
WHERE (NOW() BETWEEN began_at
AND finished_at)
您认为为 begin_at 和 finish_at 列创建复合索引有意义吗? 或者只为 begin_at 创建索引才有意义?
I have a question about this query:
SELECT *
FROM runs
WHERE (NOW() BETWEEN began_at
AND finished_at)
Do you think it makes sense to create composite index for began_at and finished_at columns?
Or it makes sense to create index only for began_at?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
你的风格很不寻常。
大多数人可能会写
WHERE begin_at < NOW() 和完成时间 > NOW()
但是。我建议在这两个字段上都建立索引。
组合键对您没有用,因为它只会加快特定日期组合的搜索速度。
嗯,这并不完全正确,因为如果您使用 betree,组合键会对您有所帮助,但不如单独索引它们那么好。
如果您使用等号 (=) 运算符搜索字段组合,则组合键非常有用。单字段索引在 ragen 请求中表现更好。
你可以用谷歌搜索一下“多维范围搜索”。
原因是在btree中基本上可以在log(n)时间内找到一个字段中的所有匹配字段。
所以你的总运行时间将是 O(k*log(n)),即 O(log(n))。
多维范围查询的运行时间为 O(sqrt(n)),更高。然而,还有更好的实现,也可以实现对数运行时间。
然而它们并没有在 mysql 中完全实现,所以根据版本的不同,它们会更糟或更糟糕。
让我总结一下:
单个字段上的相等比较:哈希索引(运行时 O(1))
单个字段上的范围搜索fields:单个字段的 btree 索引 ( O(log(n)) )
多个字段的相等搜索:组合哈希键(运行时 O(1))
这些情况是很清楚的...
这是不太清楚的地方。由于上述原因,在当前版本中,单独索引显然更好。
通过该用例的完美实现,您可以使用组合键获得更好的性能,但没有系统知道哪个支持它。
mysql 从 5.0 版本开始支持松散索引(您需要它),但非常有限,并且查询优化器仅在极少数情况下使用它们。不知道像5.3之类的新版本。
然而,随着 mysql 实现松散索引,在进行范围请求或按不同方向排序的字段上组合键变得越来越相关。
Your style is very uncommon.
Most people would probably write
WHERE began_at < NOW() AND finished_at > NOW()
However. I would recommend putting an index on both fields.
A combined key wont be of use to you because you it would only speed up searcher for specific date combinations.
Well this is not entirely true because if you use betree a combined key will help you but not as good as if you index them seperately.
Combined keys are very good if you search combinations of fields with equality (=) operator. SIngle field indexes perform better in ragen requests.
You can google a bit for "multidimensional range search".
The reason is that all matching fields in one field can be basically found in log(n) time in btrees.
So your overall runtime will be O(k*log(n)) which is O(log(n)).
Multidimensional Range queries have a runtime of O(sqrt(n)) which is higher. However there are better implementations as well which also acheav logarithmic runtime.
However they are not fully implemented in mysql, so it will be worse or awful depending on the version.
So let me sum up:
Equality comparisions on single fields: hash index (runtime O(1))
Range search on single fields: btree index on single fields ( O(log(n)) )
Equality search on multiple fields: combined hash key (runtime O(1))
those cases are a clear thing...
this is where its not so clear. with current versions its clearly better to index seperately because of the reasons given above.
With a perfect implementation for that use case you could achieve better performance with combined keys but there is no system in know of which supports it.
mysql supports loose indexes (which you need for that) since version 5.0, but only very limited and the query optimizer only utilizes them in rare cases afaik. don't know about newer versions like 5.3 or something.
however with mysql implementing loose indexes combined keys on fields where you do range requests or sorting in different directions become more and more relevant.
由于使用了不等式,而不是等式,复合索引不会比两个单独的索引做得更好(如果不是更差)。
我主张在
began_at
和finished_at
上倾向于两个单独的索引。松散索引扫描参考:
http:// www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/
http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html
Due to the use of inequalities, and not equalities, a composite index isn't going to do any much better (if not worse) than two individual indexes.
I'd advocate for leaning towards two individual indexes on both
began_at
andfinished_at
.References for Loose index scan:
http://www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/
http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html
“索引合并”策略可以从 MySQL 5 开始发挥作用: http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html - 这也表明单独的索引可能会更好。
然而,我从来没能让它为我工作:)
The "Index Merge" strategy could come into play from MySQL 5 onwards: http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html - which also suggests that separate indexes might be better.
However, I have never been able to get it to work for me :)
此类问题的正确答案始终是:“视情况而定”。尝试两种方法并进行基准测试并比较执行计划。请注意,随着表中数据量的变化和查询工作负载的变化,答案也会发生变化。不断发展的系统上的索引几乎永远不会一劳永逸。
The correct answer to this sort of question is always: "it depends". Try it both ways and benchmark and compare execution plans. Note that the answer also changes as the amount of data in your tables changes and the query workload changes. Indexes on an evolving system are almost never fire and forget.
好问题,但我实际上会从 began_at 和
finished_at
上的索引开始,因为您可以重写此查询的一种合理方式如下这使得(对我来说)更清楚的是,每一列都需要自己的索引。
Good question, but I'd actually start with indexes on both
began_at
andfinished_at
, because one reasonable way you might rewrite this query is like soWhich makes it a bit more clear (to me) that each column needs its own index.
在尝试优化此类查询时,我从未让 MySQL 对
begin_at
或finished_at
使用两个单独的索引。显然,其他人说复合索引也不起作用,因此可能无法让 MySQL 使用索引来优化此查询。I've never gotten MySQL to use use two individual indices for
begin_at
orfinished_at
when trying to optimize this kind of query. Apparently, other people are saying that a composite index won't work either, so it may just not be possible to get MySQL to optimize this query using an index.