mysql - 按索引列分组 +其中索引列导致速度下降
我有具有下一个结构的表statistics
:
+-------------------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created_at | datetime | YES | MUL | NULL | |
| year_in_tz | smallint(5) unsigned | YES | MUL | NULL | |
| month_in_tz | tinyint(3) unsigned | YES | MUL | NULL | |
+-------------------+----------------------+------+-----+---------+----------------+
在created_at、year_in_tz、month_in_tz和(year_in_tz、month_in_tz)上使用键:
ALTER TABLE `statistics` ADD INDEX created_at (created_at);
alter table statistics add index year_in_tz (year_in_tz);
alter table statistics add index month_in_tz (month_in_tz);
alter table statistics add index year_month_in_tz(year_in_tz,month_in_tz);
一些查询示例...
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (5.04 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | NULL | year_month_in_tz | 5 | NULL | 7797984 | Using index |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59')
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (1 min 33.46 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | created_at | year_month_in_tz | 5 | NULL | 7797984 | Using where |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.07 sec)
所以如果我在索引列+组上使用带有子句的where语句通过索引列,速度极低。 也许有人知道如何改进最后一个查询以使其更快?
PS 在使用索引后,我发现 (created_at,year_in_tz,month_in_tz) 上的新索引使查询运行得更快,但我希望每个查询 0-1 秒,而不是 10 秒:
alter table lending_statistics add index created_at_with_year_and_month_in_tz (created_at,year_in_tz,month_in_tz);
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | statistics | range | created_at,created_at_with_year_and_month_in_tz | created_at_with_year_and_month_in_tz | 9 | NULL | 3612208 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
集合中的 1 行 ( 0.05秒)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `lending_statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (10.62 sec)
I have table statistics
with next structure:
+-------------------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created_at | datetime | YES | MUL | NULL | |
| year_in_tz | smallint(5) unsigned | YES | MUL | NULL | |
| month_in_tz | tinyint(3) unsigned | YES | MUL | NULL | |
+-------------------+----------------------+------+-----+---------+----------------+
With keys on created_at, year_in_tz, month_in_tz and on (year_in_tz, month_in_tz):
ALTER TABLE `statistics` ADD INDEX created_at (created_at);
alter table statistics add index year_in_tz (year_in_tz);
alter table statistics add index month_in_tz (month_in_tz);
alter table statistics add index year_month_in_tz(year_in_tz,month_in_tz);
Some queries example...
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (5.04 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | NULL | year_month_in_tz | 5 | NULL | 7797984 | Using index |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59')
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (1 min 33.46 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | created_at | year_month_in_tz | 5 | NULL | 7797984 | Using where |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.07 sec)
So if I use where statement with clause on indexed column + group by indexed columns, speed is extremely low.
Maybe someone know how to improve last query to make it faster?
P.S. After playing with indexes, I found that new index on (created_at, year_in_tz, month_in_tz) made query run faster, but I want 0-1 seconds per query, not 10 seconds:
alter table lending_statistics add index created_at_with_year_and_month_in_tz (created_at,year_in_tz,month_in_tz);
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | statistics | range | created_at,created_at_with_year_and_month_in_tz | created_at_with_year_and_month_in_tz | 9 | NULL | 3612208 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.05 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `lending_statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (10.62 sec)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
将字段 ID 添加到索引created_at_with_year_and_month_in_tz,然后更改您的 select 语句以使用
在 MySQL 5.6 中,ICP 功能可能会在这种情况下有所帮助,因为访问的所有字段都是索引的一部分。我相信当你指定count(*)时MySQL可能会读取实际的数据记录,因此它需要读取索引文件和数据文件。
Add the field ID to your index created_at_with_year_and_month_in_tz and then change your select statement to use
In MySQL 5.6 the ICP feature might help in this case cause all fields accessed are part of the index. I believe that MySQL might reads the actual data record when you specify count(*) hence it needs to read the index file as well as the datafile.
试试这个,日期时间索引有一个已知的 MySQL 问题
Try this, there is a known MySQL issue with datetime indexes
缓慢的
COUNT(*)
查询是 MySQL 和 MySQL 的常见问题。 PostgreSQL(和其他RDBMS),因为在查询执行期间执行顺序表扫描。尝试考虑将聚合数据缓存在其他地方:memcached、redis等Slow
COUNT(*)
queries is the often trouble of MySQL & PostgreSQL (and other RDBMS), because sequental table scan is performed during the query execution. Try to think about caching your aggregated data somewhere else: memcached, redis, etc.