Tips for improving this slow MySQL query?
I'm using a query which generally executes in under a second, but sometimes takes 10-40 seconds to finish. I'm actually not totally clear on how the subquery works; I just know that it works, in that it gives me 15 rows for each faver_profile_id.
I'm logging slow queries and it's telling me 5823244 rows were examined, which is odd because there aren't anywhere close to that many rows in any of the tables involved (the favorites table has the most at 50,000 rows).
Can anyone offer me some pointers? Is it an issue with the subquery and needing to use filesort?
EDIT: Running explain shows that the users table is not using an index (even though id is the primary key). Under extra it says: Using temporary; Using filesort.
SELECT F.id, F.created, U.username, U.fullname, U.id, I.*
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE F.faver_profile_id IN (360, 379, 95, 315, 278, 1)
  AND F.removed = 0
  AND I.removed = 0
  AND F.collection_id IS NULL
  AND I.nudity = 0
  AND (SELECT COUNT(*) FROM favorites
       WHERE faver_profile_id = F.faver_profile_id
         AND created > F.created
         AND removed = 0
         AND collection_id IS NULL) < 15
ORDER BY F.faver_profile_id, F.created DESC;
5 Answers
The number of rows examined is large because many rows have been examined more than once. You are getting this because of an incorrectly optimized query plan, which results in table scans when index lookups should have been performed. In this case the number of rows examined is exponential, i.e. of an order of magnitude comparable to the product of the total number of rows in more than one table.

You may also change your query to use GROUP BY faver_profile_id / HAVING count > 15 instead of the nested SELECT COUNT(*) subquery, as suggested by vartec. The performance of both your original query and vartec's should be comparable if both are properly optimized, e.g. using hints (your query would use nested index lookups, whereas vartec's query would use a hash-based strategy).
I think with GROUP BY and HAVING it should be faster. Is that what you want?

Don't know which fields from items you need, so I've put placeholders.
I suggest you use MySQL's EXPLAIN to see how your MySQL server handles the query. My bet is your indexes aren't optimal, but EXPLAIN should do much better than my bet.
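For example, using the query from the question; the composite index at the end is my own assumption about what "optimal" might look like here, not something stated in the thread:

```sql
-- Show the execution plan MySQL chooses for the problematic query.
EXPLAIN
SELECT F.id, F.created, U.username, U.fullname, U.id, I.*
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE F.faver_profile_id IN (360, 379, 95, 315, 278, 1)
  AND F.removed = 0
  AND I.removed = 0
  AND F.collection_id IS NULL
  AND I.nudity = 0
  AND (SELECT COUNT(*) FROM favorites
       WHERE faver_profile_id = F.faver_profile_id
         AND created > F.created
         AND removed = 0
         AND collection_id IS NULL) < 15
ORDER BY F.faver_profile_id, F.created DESC;

-- Hypothetical composite index: would let both the outer IN (...)
-- filter and the correlated subquery's
-- "faver_profile_id = ... AND created > ..." condition use index
-- range scans instead of table scans.
ALTER TABLE favorites
  ADD INDEX idx_faver_created (faver_profile_id, created);
```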
You could do a loop on each id and use LIMIT instead of the COUNT(*) subquery:
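The snippet this answer referred to is not in this copy of the thread. A sketch of the idea, reusing the question's query and run once per id from application code (an assumed reconstruction, not the answerer's original code):

```sql
-- One iteration of the loop: fetch this profile's 15 newest
-- favorites directly with ORDER BY ... LIMIT, instead of the
-- correlated COUNT(*) subquery.
SELECT F.id, F.created, U.username, U.fullname, U.id, I.*
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE F.faver_profile_id = 360   -- substitute each id in turn
  AND F.removed = 0
  AND I.removed = 0
  AND F.collection_id IS NULL
  AND I.nudity = 0
ORDER BY F.created DESC
LIMIT 15;
```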
I'll suppose the result of that query is intended to be shown as a paged list. In that case, perhaps you could consider doing a simpler "unjoined" query first, and then a second query for each row, reading only the 15, 20 or 30 elements actually shown. Isn't a JOIN a heavy operation? This would simplify the query, and it wouldn't become slower when the joined tables grow.

Tell me if I'm wrong, please.