How can I speed up a count(*) that is already using indexes? (MyISAM)

Posted 2024-08-05 04:18:58

I have 3 large tables (10k, 10k, and 100M rows) and am trying to do a simple count on a join of them, where all the joined columns are indexed. Why does the COUNT(*) take so long, and how can I speed it up (without triggers and a running summary)?

mysql> describe SELECT COUNT(*) FROM `metaward_alias` INNER JOIN `metaward_achiever` ON (`metaward_alias`.`id` = `metaward_achiever`.`alias_id`) INNER JOIN `metaward_award` ON (`metaward_achiever`.`award_id` = `metaward_award`.`id`) WHERE `metaward_award`.`owner_id` = 8;
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
| id | select_type | table             | type   | possible_keys                                         | key                        | key_len | ref                             | rows | Extra       |
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
|  1 | SIMPLE      | metaward_award    | ref    | PRIMARY,metaward_award_owner_id                       | metaward_award_owner_id    | 4       | const                           | 1552 |             | 
|  1 | SIMPLE      | metaward_achiever | ref    | metaward_achiever_award_id,metaward_achiever_alias_id | metaward_achiever_award_id | 4       | paul.metaward_award.id          | 2498 |             | 
|  1 | SIMPLE      | metaward_alias    | eq_ref | PRIMARY                                               | PRIMARY                    | 4       | paul.metaward_achiever.alias_id |    1 | Using index | 
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
3 rows in set (0.00 sec)

But actually running the query takes about 10 minutes, and I'm on MyISAM, so the tables are fully locked for that duration.

Comments (3)

青巷忧颜 2024-08-12 04:18:58

I guess the reason is that you are doing a huge join over three tables (without applying the WHERE clause first, the result would be 10k * 10k * 100M = 10^16 rows). Try reordering the joins: for example, start with metaward_award, join only metaward_achiever and see how long that takes, then add metaward_alias, possibly using a subquery to force your preferred evaluation order.
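The experiment described above could look roughly like the following. This is only a sketch: STRAIGHT_JOIN is a standard MySQL modifier that makes the optimizer join tables in the order they are listed, but whether it beats the plan MySQL already picks depends on the data and on what is cached. Note that the two-table count equals the full count only if every alias_id in metaward_achiever has a matching row in metaward_alias, which is an assumption about the data.

```sql
-- Step 1: count with only two tables, to see how much time the
-- metaward_achiever lookups cost on their own.
SELECT COUNT(*)
FROM metaward_award
INNER JOIN metaward_achiever
        ON metaward_achiever.award_id = metaward_award.id
WHERE metaward_award.owner_id = 8;

-- Step 2: add metaward_alias back, forcing the join order as written.
SELECT STRAIGHT_JOIN COUNT(*)
FROM metaward_award
INNER JOIN metaward_achiever
        ON metaward_achiever.award_id = metaward_award.id
INNER JOIN metaward_alias
        ON metaward_alias.id = metaward_achiever.alias_id
WHERE metaward_award.owner_id = 8;
```

If step 1 is already slow, the alias join is not the bottleneck and the time is going into reading metaward_achiever.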

If that does not help, you might have to denormalize your data, for example by storing the number of aliases for each metaward_achiever. Then you'd get rid of one join altogether. Maybe you can even cache the sums for metaward_award, depending on how, and how often, your data is updated.
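As one possible shape for such a cache, here is a sketch of a per-owner count table that is rebuilt periodically (for example from a scheduled job) rather than maintained by triggers. The table name metaward_owner_achiever_count and its columns are hypothetical, and the refresh skips the metaward_alias join on the assumption that every achiever row points at an existing alias.

```sql
-- Hypothetical cache table: one row per owner with the precomputed count.
CREATE TABLE metaward_owner_achiever_count (
    owner_id       INT             NOT NULL PRIMARY KEY,
    achiever_count BIGINT UNSIGNED NOT NULL
) ENGINE=MyISAM;

-- Periodic refresh: recompute every owner's count in one pass.
REPLACE INTO metaward_owner_achiever_count (owner_id, achiever_count)
SELECT a.owner_id, COUNT(*)
FROM metaward_award AS a
INNER JOIN metaward_achiever AS ac
        ON ac.award_id = a.id
WHERE a.owner_id IS NOT NULL
GROUP BY a.owner_id;

-- Reading the cached value is then a primary-key lookup.
SELECT achiever_count
FROM metaward_owner_achiever_count
WHERE owner_id = 8;
```

The refresh itself still scans the large table, so this only pays off when slightly stale counts are acceptable and reads are much more frequent than refreshes.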

Another thing that might help is getting all your database content into RAM :-)

一袭白衣梦中忆 2024-08-12 04:18:58

Make sure you have indexes on:

metaward_alias      id
metaward_achiever   alias_id
metaward_achiever   award_id
metaward_award      id
metaward_award      owner_id
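A quick way to check the list above is SHOW INDEX, sketched below. The commented-out composite index is not part of the original suggestion: it is an assumption that an index covering both award_id and alias_id might let MySQL read alias_id from the index without touching the data rows, and its name is made up for illustration.

```sql
-- List the existing indexes on each table involved in the join.
SHOW INDEX FROM metaward_alias;
SHOW INDEX FROM metaward_achiever;
SHOW INDEX FROM metaward_award;

-- Optional experiment (hypothetical index name, not from the original post):
-- a composite index so the achiever step can be answered from the index alone.
-- Building it on a 100M-row MyISAM table takes time and locks the table.
-- CREATE INDEX metaward_achiever_award_alias
--     ON metaward_achiever (award_id, alias_id);
```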

I'm sure many people will also suggest counting a specific column instead of using COUNT(*), but in MySQL this makes no difference for your query.

UPDATE:

You could also try listing the table that carries the WHERE condition first, as the main table, instead of joining it in last. That would give you the same result, but it could be faster (I don't know how clever MySQL's optimizer is):

SELECT COUNT(*) FROM `metaward_award` 
   INNER JOIN `metaward_achiever` 
      ON (`metaward_achiever`.`award_id` = `metaward_award`.`id`) 
   INNER JOIN `metaward_alias` 
      ON (`metaward_alias`.`id` = `metaward_achiever`.`alias_id`) 
WHERE `metaward_award`.`owner_id` = 8;

树深时见影 2024-08-12 04:18:58

10 minutes is way too long for that query. I think you must have a really small key cache. You can get its size in bytes with:

SELECT @@key_buffer_size;
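To judge whether the key cache really is too small, and to try a larger one, something like the following could be used. The 256 MB figure is purely illustrative and should be sized against the machine's RAM and the .MYI file sizes; SET GLOBAL needs sufficient privileges and does not survive a server restart unless the value is also put in the configuration file.

```sql
-- Compare Key_reads (index blocks read from disk) with Key_read_requests
-- (index block requests); a high ratio suggests the cache is too small.
SHOW GLOBAL STATUS LIKE 'Key_read%';

-- Illustrative value only: 256 MB expressed in bytes.
SET GLOBAL key_buffer_size = 268435456;

-- Optionally preload the MyISAM indexes into the key cache.
LOAD INDEX INTO CACHE metaward_achiever, metaward_award, metaward_alias;
```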

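First off, you should run ANALYZE TABLE or OPTIMIZE TABLE. ANALYZE TABLE refreshes the key distribution statistics the optimizer relies on, and on MyISAM OPTIMIZE TABLE also defragments the data file and sorts the index pages, which can slightly improve performance.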
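For the tables in the question, that would be along these lines. Both statements lock MyISAM tables while they run, and OPTIMIZE TABLE on the 100M-row table may take a long time, so this is best tried in a maintenance window.

```sql
-- Refresh index statistics for all three tables (relatively cheap).
ANALYZE TABLE metaward_alias, metaward_achiever, metaward_award;

-- Defragment the data file and sort the index pages of the largest table
-- (assumed here to be metaward_achiever).
OPTIMIZE TABLE metaward_achiever;
```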

You should also see if you can use more compact types for your columns. For instance, if you're not going to have more than 16 million owners, awards, or aliases, you can change your INT columns to MEDIUMINT (UNSIGNED, of course). Perhaps even SMALLINT in some cases? That will reduce your index footprint, so more of the index will fit in the cache.
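A sketch of that change for the join columns of the large table follows. It assumes the columns are currently INT NOT NULL (the key_len of 4 in the EXPLAIN output is consistent with that) and that the current maximum values fit; MEDIUMINT UNSIGNED tops out at 16,777,215. The ALTER rebuilds the table and locks it for the duration.

```sql
-- Check that existing values fit in MEDIUMINT UNSIGNED (max 16777215).
SELECT MAX(alias_id), MAX(award_id) FROM metaward_achiever;

-- Shrink both join columns in a single table rebuild.
ALTER TABLE metaward_achiever
    MODIFY alias_id MEDIUMINT UNSIGNED NOT NULL,
    MODIFY award_id MEDIUMINT UNSIGNED NOT NULL;
```

For consistency, the id columns these reference in metaward_alias and metaward_award would normally be changed to the same type.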
