How to count quickly on large tables?

Posted on 2024-09-26 09:22:57

I have large MySQL tables with hundreds of thousands of rows.

I need to write a query on a customers table that counts how many customers will become available to contact again in each time period.

For example:

```
SELECT 'This week', COUNT(*) FROM customers
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) < 7

UNION

SELECT 'Next week', COUNT(*) FROM customers
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) >= 7
    AND DATEDIFF(NOW(), available_date) < 14

UNION

... (a few more like this)
```

Having written a similar query on a different large table, I did notice that changing the engine from InnoDB to MyISAM sped up the query considerably (InnoDB is not needed for these tables as they do not have foreign key checks). Is there anything else I can do to speed up counts like this (other than indexing the appropriate fields)?

生来就爱笑 2024-10-03 09:22:57

```
WHERE sales_person_id = 1 AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE()
```

Written this way, MySQL should be able to use a composite index created on the (sales_person_id, available_date) columns (check with EXPLAIN).
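
A minimal sketch of that setup, assuming the index does not exist yet. The index name `idx_sales_available` is made up for illustration, and the exact EXPLAIN output depends on the MySQL version and the data:

```sql
-- Hypothetical index name; the equality column comes first, the range column second.
CREATE INDEX idx_sales_available
    ON customers (sales_person_id, available_date);

-- Inspect the plan. If the optimizer picks the composite index, the `key`
-- column should name it, typically with access type `range`.
EXPLAIN
SELECT COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE();
```

Since both columns referenced by the query are in the index, the count can usually be answered from the index alone, without reading the table rows.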

夏了南城 2024-10-03 09:22:57

Never do in multiple queries what you can do in one.

If you create a derived table (inline view) that holds the necessary start and end dates, it can be used to produce the desired result in a single query using a GROUP BY. MySQL versions before 8.0 have no recursive queries, so you have to use the NUMBERS table trick to generate the dates:

Create a table that only holds incrementing numbers, which is easy to do using an auto_increment column:

```
DROP TABLE IF EXISTS `example`.`numbers`;
CREATE TABLE `example`.`numbers` (
  `id` int(10) unsigned NOT NULL auto_increment,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
```

Populate the table using:
```
INSERT INTO `numbers` (id)
VALUES (NULL);
```
...for as many values as you need.
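
As a small illustration of the populate step, several rows can be added in one statement. This sketch assumes the `numbers` table created above is empty and that eight rows are enough; the row count is an arbitrary choice:

```sql
-- Each NULL makes AUTO_INCREMENT assign the next id.
INSERT INTO `numbers` (id)
VALUES (NULL), (NULL), (NULL), (NULL), (NULL), (NULL), (NULL), (NULL);

-- Check how many rows the helper table holds and the highest id assigned.
SELECT COUNT(*) AS total, MAX(id) AS highest
  FROM `numbers`;
```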

Use DATE_ADD to construct a list of dates, increasing the days based on the `numbers`.id value:

```
SELECT x.start_dt,
       x.end_dt
  FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
               DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
          FROM `numbers` n
         WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x;
```
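
On MySQL 8.0 or later, a recursive common table expression can generate the same kind of list without a helper table. The sketch below is an alternative to the numbers-table approach, not part of it; the CTE name `days` and the limit of eight rows are illustrative assumptions:

```sql
-- Requires MySQL 8.0+; WITH RECURSIVE is not available in earlier versions.
WITH RECURSIVE days (n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM days WHERE n < 7   -- stops after 8 rows (n = 0..7)
)
SELECT NOW() + INTERVAL n DAY       AS start_dt,  -- same as n.id - 1 above
       NOW() + INTERVAL (n + 7) DAY AS end_dt     -- same as n.id + 6 above
  FROM days;
```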

JOIN onto your table of data based on the datetime portion:

```
  SELECT x.start_dt,
         x.end_dt,
         COUNT(*) AS num
    FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                 DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
            FROM `numbers` n
           WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
    JOIN customers c ON c.available_date BETWEEN x.start_dt
                                             AND x.end_dt
GROUP BY x.start_dt, x.end_dt;
```
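
Note that the join above produces one seven-day window per row in `numbers`, each starting a day later than the previous one, and BETWEEN includes both endpoints, so a customer can be counted in several windows. If non-overlapping weekly buckets are wanted, as in the question, one possible variant is sketched below. It assumes that "this week" means the seven days starting today, that four weeks are enough, and that the `sales_person_id = 1` filter from the question applies:

```sql
SELECT x.start_dt,
       x.end_dt,
       COUNT(*) AS num
  FROM (SELECT CURDATE() + INTERVAL ((n.id - 1) * 7) DAY AS start_dt,
               CURDATE() + INTERVAL (n.id * 7) DAY       AS end_dt
          FROM `numbers` n
         WHERE n.id <= 4) x              -- four weekly buckets; arbitrary cutoff
  JOIN customers c
    ON c.available_date >= x.start_dt
   AND c.available_date <  x.end_dt      -- half-open range avoids double counting
 WHERE c.sales_person_id = 1             -- filter taken from the question
 GROUP BY x.start_dt, x.end_dt
 ORDER BY x.start_dt;
```

Because this is an inner join, weeks with no matching customers do not appear in the result; a LEFT JOIN with COUNT(c.available_date) would be needed to report them as zero.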

Don't apply functions to the column data itself, e.g. DATEDIFF(NOW(), available_date). The database can't use an index on the available_date column (if one exists) for such a condition, because the value being compared is no longer the indexed value.
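
To make this concrete, here is the first condition from the question next to a rewrite that leaves the column bare. This is a sketch: the rewrite relies on DATEDIFF comparing only the date parts of its arguments, so `DATEDIFF(NOW(), available_date) < 7` holds exactly when available_date falls on or after the day six days before today:

```sql
-- Function wrapped around the column: an index on available_date
-- cannot be used to narrow the range of rows examined.
SELECT COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND DATEDIFF(NOW(), available_date) < 7;

-- Same rows, but the column stands alone on one side of the comparison,
-- so a range scan on an index covering available_date becomes possible.
SELECT COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date >= CURDATE() - INTERVAL 6 DAY;
```

Running EXPLAIN on both statements is a quick way to see whether the second form actually changes the plan on a given table.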