How to do a fast count on large tables?
I have large MySQL tables with hundreds of thousands of rows.
I need to write a query on a customers table which gets the count of when customers will be available to contact again.
For example:
SELECT 'This week', COUNT(*) FROM customers
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) < 7
UNION
SELECT 'Next week', COUNT(*) FROM customers
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) >= 7
AND DATEDIFF(NOW(), available_date) < 14
UNION
... (a few more like this)
Having written a similar query on a different large table, I did notice that changing the engine from InnoDB to MyISAM sped up the query considerably (InnoDB is not needed for these tables as they do not have foreign key checks). Is there anything else I can do to speed up counts like this (other than indexing the appropriate fields)?
Comments (3)
WHERE sales_person_id = 1 AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE()
Doing it this way should let MySQL use a composite index created on the (sales_person_id, available_date) columns (use EXPLAIN to check).
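To make the suggestion above concrete, here is a minimal sketch of creating the composite index and checking the plan. The index name idx_customers_sales_avail is a hypothetical choice; the table and column names come from the question.

```sql
-- Hypothetical index name; columns are the ones named in the answer.
CREATE INDEX idx_customers_sales_avail
    ON customers (sales_person_id, available_date);

-- Ask MySQL how it would run the count without actually running it.
EXPLAIN
SELECT COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE();
```

If the index is picked up, the key column of the EXPLAIN output should name it. Because the query touches only the two indexed columns, the Extra column will typically also show "Using index", meaning the count is answered from the index alone.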
Never do in multiple queries what you can do in one.
If you create a derived table/inline view that has the necessary start and end dates, this can be used to produce the desired result in a single query using a GROUP BY. MySQL doesn't have recursive functionality (recursive CTEs only arrived in MySQL 8.0), so you have to use the NUMBERS table trick to generate dates...
Create a table that only holds incrementing numbers, which is easy to do using an auto_increment column.
Populate the table with as many values as you need.
Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value.
JOIN onto your table of data based on the datetime portion:
   SELECT x.start_dt,
          x.end_dt,
          COUNT(*) AS num
     FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                  DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
             FROM `numbers` n
            WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
     JOIN customers c ON c.available_date BETWEEN x.start_dt
                                              AND x.end_dt
 GROUP BY x.start_dt, x.end_dt
Don't use functions performed upon actual column data, i.e. DATEDIFF(NOW(), available_date), because a database can't use an index (if one exists) on the available_date column once the data has been altered away from the index value.
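The statements for creating and populating the numbers table are not shown in this thread. The following is a minimal sketch, assuming a table called numbers with a single auto-increment id column, as the query's `numbers` n and n.id references imply. The column type is an assumption.

```sql
-- Assumed definition: one auto-increment column, nothing else.
CREATE TABLE numbers (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (id)
);

-- Inserting NULL into an AUTO_INCREMENT column generates the next id.
-- Repeat the statement once per row needed.
INSERT INTO numbers (id) VALUES (NULL);
INSERT INTO numbers (id) VALUES (NULL);
INSERT INTO numbers (id) VALUES (NULL);

-- Turn the ids into a list of consecutive dates starting today.
SELECT DATE_ADD(CURDATE(), INTERVAL n.id - 1 DAY) AS dt
  FROM numbers n
 ORDER BY n.id;
```

With three rows in the table, the final SELECT returns today and the next two days. Adding more rows extends the range.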
Focus on the WHERE clause.
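As an illustration of what focusing on the WHERE clause might look like here, the sketch below rewrites the two DATEDIFF conditions from the question as plain range comparisons on available_date, so that an index on (sales_person_id, available_date) can be used. The boundaries are worked out from DATEDIFF comparing date parts only, and they are meant to select the same rows for both DATE and DATETIME columns. The aliases period and num are illustrative.

```sql
-- DATEDIFF(NOW(), available_date) < 7
--   is the same as available_date >= CURDATE() - INTERVAL 6 DAY
SELECT 'This week' AS period, COUNT(*) AS num
  FROM customers
 WHERE sales_person_id = 1
   AND available_date >= CURDATE() - INTERVAL 6 DAY
UNION ALL
-- DATEDIFF(NOW(), available_date) >= 7 AND < 14
--   is the same as the half-open range below
SELECT 'Next week', COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date >= CURDATE() - INTERVAL 13 DAY
   AND available_date <  CURDATE() - INTERVAL 6 DAY;
```

UNION ALL is used instead of UNION because the labels already make every row distinct, so MySQL can skip the duplicate-elimination step.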