How can I count quickly on large tables?

Posted on 2024-09-26 09:22:57 · 549 words · 0 views · 0 comments

I have large MySQL tables with hundreds of thousands of rows.

I need to write a query on a customers table that counts how many customers become available to contact again in each period.

For example:

SELECT 'This week', COUNT(*) FROM customers 
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) < 7

UNION

SELECT 'Next week', COUNT(*) FROM customers 
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) >= 7 
    AND DATEDIFF(NOW(), available_date) < 14

UNION

... (a few more like this)

Having written a similar query on a different large table, I did notice that changing the engine from InnoDB to MyISAM sped up the query considerably (InnoDB is not needed for these tables as they do not have foreign key checks). Is there anything else I can do to speed up counts like this (other than indexing the appropriate fields)?

Comments (3)

生来就爱笑 2024-10-03 09:22:57

WHERE sales_person_id = 1 AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE()

Doing it this way should let MySQL use a composite index created on (sales_person_id, available_date) columns (use EXPLAIN to check)
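A minimal sketch of what that check could look like. The index name idx_customers_sp_avail is made up for illustration; the table and column names are taken from the question.

```sql
-- Hypothetical index name; the column order follows the answer above:
-- the equality column first, then the range column.
CREATE INDEX idx_customers_sp_avail
    ON customers (sales_person_id, available_date);

-- Ask MySQL how it plans to run the count.
EXPLAIN
SELECT COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE();
```

If the index is picked up, the `key` column of the EXPLAIN output should name it. Because the query touches only the two indexed columns, MySQL can usually answer it from the index alone, without reading the table rows.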

夏了南城 2024-10-03 09:22:57
  1. Never do in multiple queries, what you can do in one.

    If you create a derived table/inline view that has the necessary start & end dates, this can be used to produce the desired result in a single query using a GROUP BY. MySQL versions before 8.0 have no recursive queries (recursive CTEs arrived in 8.0), so you have to use the NUMBERS table trick to generate dates:

    1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

      DROP TABLE IF EXISTS `example`.`numbers`;
      CREATE TABLE  `example`.`numbers` (
       `id` int(10) unsigned NOT NULL auto_increment,
        PRIMARY KEY  (`id`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      
    2. Populate the table using:

      INSERT INTO `numbers` (id)
      VALUES (NULL);
      

      ...for as many values as you need.

    3. Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value.

      SELECT x.start_dt,
             x.end_dt
        FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                     DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
                FROM `numbers` n
               WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
      
    4. JOIN onto your table of data based on the datetime portion:

        SELECT x.start_dt,
               x.end_dt,
               COUNT(*) AS num
          FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                       DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
                  FROM `numbers` n
                 WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
          JOIN customers c ON c.available_date BETWEEN x.start_dt
                                                   AND x.end_dt
      GROUP BY x.start_dt, x.end_dt
      
  2. Don't apply functions to the column data itself, e.g. DATEDIFF(NOW(), available_date), because the database then can't use an index (if one exists) on the available_date column: the value being compared is the function's result, not the indexed value.
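Note that the windows built in step 4 start on every consecutive day, so they overlap. If what you want is one row per week, a variant along these lines may be closer. It is only a sketch under assumptions: the buckets are forward-looking and start today, available_date is a DATE or DATETIME column, the cap of four weeks is arbitrary, and the sales_person_id filter is carried over from the question.

```sql
-- One row per 7-day bucket; the numbers table supplies the bucket index.
SELECT x.start_dt,
       x.end_dt,
       COUNT(*) AS num
  FROM (SELECT CURDATE() + INTERVAL ((n.id - 1) * 7) DAY AS start_dt,
               CURDATE() + INTERVAL (n.id * 7) DAY       AS end_dt
          FROM `numbers` n
         WHERE n.id <= 4) x              -- four weekly buckets (arbitrary)
  JOIN customers c
    ON c.available_date >= x.start_dt    -- half-open range, so a date
   AND c.available_date <  x.end_dt      -- never lands in two buckets
 WHERE c.sales_person_id = 1
 GROUP BY x.start_dt, x.end_dt;
```

The column is compared bare on both sides of the range, in line with point 2. Because this is an inner join, weeks with no matching customers do not appear in the result at all.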

—━☆沉默づ 2024-10-03 09:22:57

Focus on the WHERE clause.

  • Is there an index on the fields in the WHERE clause?
  • Can you replace the datediff() call with a comparison against a constant? As written, it is evaluated for every row.
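As a sketch of the second point, the two branches from the question can be rewritten so that the date arithmetic happens once on the constant side and the column is compared directly. This assumes available_date is a DATE column; with a DATETIME column the boundaries would need adjusting, because DATEDIFF() ignores the time part.

```sql
-- DATEDIFF(NOW(), available_date) < 7
--   is the same as  available_date > CURDATE() - INTERVAL 7 DAY
SELECT 'This week', COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date > CURDATE() - INTERVAL 7 DAY

UNION ALL

-- DATEDIFF(NOW(), available_date) >= 7 AND ... < 14
--   is the same as a range between 14 and 7 days ago
SELECT 'Next week', COUNT(*)
  FROM customers
 WHERE sales_person_id = 1
   AND available_date >  CURDATE() - INTERVAL 14 DAY
   AND available_date <= CURDATE() - INTERVAL 7 DAY;
```

UNION ALL is used here instead of UNION because the labelled rows can never be duplicates, so the de-duplication step is unnecessary.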