Should I avoid COUNT altogether in InnoDB?
Right now, I'm debating whether to use COUNT(id) or "count" columns. I heard that InnoDB COUNT is very slow without a WHERE clause because it needs to lock the table and do a full index scan. Is the behavior the same when using a WHERE clause?

For example, suppose I have a table with 1 million records. Doing a COUNT without a WHERE clause will require looking up 1 million records using an index. Will the query become significantly faster if adding a WHERE clause decreases the number of rows that match the criteria from 1 million to 500,000?

Consider the "Badges" page on SO: would adding a column called count to the badges table and incrementing it whenever a user earned that particular badge be faster than doing a SELECT COUNT(id) FROM user_badges WHERE user_id = 111?

Using MyISAM is not an option because I need the features of InnoDB to maintain data integrity.
Comments (3)
SELECT COUNT(*) FROM tablename seems to do a full table scan.

SELECT COUNT(*) FROM tablename USE INDEX (colname) seems to be quite fast if the available index is NOT NULL, UNIQUE, and fixed-length. A non-UNIQUE index doesn't help much, if at all. Variable-length indices (VARCHAR) seem to be slower, but that may just be because the index is physically larger. Integer UNIQUE NOT NULL indices can be counted quickly, which makes sense.
MySQL really should perform this optimization automatically.
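
One way to check the observation above on your own data is to compare the plans MySQL picks with and without the hint. This is only a sketch: tablename comes from the answer, while idx_colname is a hypothetical index name (USE INDEX expects the name of an index, which often but not always matches the column name).

```sql
-- Plan chosen by the optimizer on its own; the "key" column of the
-- output shows which index (if any) is scanned for the count.
EXPLAIN SELECT COUNT(*) FROM tablename;

-- Same count with an index hint. idx_colname is a placeholder for an
-- existing index on the column; replace it with the real index name.
EXPLAIN SELECT COUNT(*) FROM tablename USE INDEX (idx_colname);
```

If both plans report the same key, the hint is not changing anything and the timing difference probably has another cause, such as caching.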
Performance of COUNT() is fine as long as you have an index that's used.

If you have a million records and the column in question is NOT NULL, then a COUNT() will be a million quite easily. If NULL values are allowed, those aren't indexed, so the number of records is easily obtained by looking at the index size.

If you're not specifying a WHERE clause, then the worst case is that the primary key index will be used.

If you specify a WHERE clause, just make sure the column(s) are indexed.
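
As a rough illustration of the advice to index the WHERE column, here is a sketch using the user_badges table from the question. The index name idx_user_badges_user_id is made up for the example, and the schema (an id primary key plus a user_id column) is assumed rather than taken from the original post.

```sql
-- Hypothetical index on the column used in the WHERE clause.
CREATE INDEX idx_user_badges_user_id ON user_badges (user_id);

-- Check that the count is answered from that index. In InnoDB a
-- secondary index also stores the primary key, so COUNT(id) with a
-- filter on user_id can usually be satisfied from the index alone.
EXPLAIN SELECT COUNT(id) FROM user_badges WHERE user_id = 111;
```

When the index is used, the work is proportional to the number of matching rows rather than to the size of the table.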
I wouldn't say avoid, but it depends on what you are trying to do:
If you only need to provide an estimate, you could do SELECT MAX(id) FROM table. This is much cheaper, since it just needs to read the max value in the index.
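
A sketch of two cheap estimates, assuming the table is user_badges with an AUTO_INCREMENT id (an assumption, since the original does not show the schema). Neither is exact: MAX(id) overstates the count once rows have been deleted or ids skipped, and TABLE_ROWS is only an approximation for InnoDB tables.

```sql
-- Upper-bound style estimate: reads only the last entry of the
-- primary key index.
SELECT MAX(id) AS approx_rows FROM user_badges;

-- Estimate from InnoDB's table statistics for the current database.
SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'user_badges';
```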
If we consider the badges example you gave, InnoDB only needs to count up the number of badges that user has (assuming an index on user_id). I'd say in most case that's not going to be more than 10-20, and it's not much harm at all.
It really depends on the situation. I probably would keep the count of the number of badges someone has on the main user table as a column (count_badges_awarded) simply because every time an avatar is shown, so is that number. It saves me having to do 2 queries.
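
A minimal sketch of the counter-column approach described above. The table and column names other than count_badges_awarded and user_badges are assumptions (users.id, user_badges.badge_id, and the badge id 7 are hypothetical). Doing the insert and the increment in one transaction keeps the stored count from drifting away from the real one.

```sql
-- One-time schema change: the denormalized counter on the user row.
ALTER TABLE users
  ADD COLUMN count_badges_awarded INT UNSIGNED NOT NULL DEFAULT 0;

-- Awarding a badge: both statements succeed or fail together.
START TRANSACTION;
INSERT INTO user_badges (user_id, badge_id) VALUES (111, 7);
UPDATE users
SET count_badges_awarded = count_badges_awarded + 1
WHERE id = 111;
COMMIT;

-- Showing the avatar: the count comes along with the user row,
-- so no second query against user_badges is needed.
SELECT count_badges_awarded FROM users WHERE id = 111;

-- Occasional consistency check: lists users whose stored counter
-- disagrees with the actual number of badge rows.
SELECT u.id
FROM users AS u
LEFT JOIN (
  SELECT user_id, COUNT(*) AS n
  FROM user_badges
  GROUP BY user_id
) AS b ON b.user_id = u.id
WHERE u.count_badges_awarded <> COALESCE(b.n, 0);
```

The trade-off is an extra write on every award in exchange for a cheaper read on every page view, which tends to pay off when the number is displayed far more often than it changes.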