Mysql - 帮我优化这个查询(改进的问题)
关于系统: - 有导师创建课程和包 - 遵循基于标签的搜索方法。当新导师注册和导师创建包时创建标签关系(这使得导师和包可搜索)。有关详细信息,请查看标签在该系统中如何工作部分?以下。
以下是相关查询
SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
SUM(DISTINCT( t.tag LIKE "%democracy%" )) AS key_2_total_matches,
COUNT(DISTINCT( od.id_od )) AS tutor_popularity,
CASE
WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
AND wc.wc_type = 0
AND wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'classes_published',
CASE
WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'packs_published',
td . *,
u . *
FROM tutor_details AS td
JOIN users AS u
ON u.id_user = td.id_user
LEFT JOIN learning_packs_tag_relations AS lptagrels
ON td.id_tutor = lptagrels.id_tutor
LEFT JOIN learning_packs AS lp
ON lptagrels.id_lp = lp.id_lp
LEFT JOIN learning_packs_categories AS lpc
ON lpc.id_lp_cat = lp.id_lp_cat
LEFT JOIN learning_packs_categories AS lpcp
ON lpcp.id_lp_cat = lpc.id_parent
LEFT JOIN learning_pack_content AS lpct
ON ( lp.id_lp = lpct.id_lp )
LEFT JOIN webclasses_tag_relations AS wtagrels
ON td.id_tutor = wtagrels.id_tutor
LEFT JOIN webclasses AS wc
ON wtagrels.id_wc = wc.id_wc
LEFT JOIN learning_packs_categories AS wcc
ON wcc.id_lp_cat = wc.id_wp_cat
LEFT JOIN learning_packs_categories AS wccp
ON wccp.id_lp_cat = wcc.id_parent
LEFT JOIN order_details AS od
ON td.id_tutor = od.id_author
LEFT JOIN orders AS o
ON od.id_order = o.id_order
LEFT JOIN tutors_tag_relations AS ttagrels
ON td.id_tutor = ttagrels.id_tutor
JOIN tags AS t
ON ( t.id_tag = ttagrels.id_tag )
OR ( t.id_tag = lptagrels.id_tag )
OR ( t.id_tag = wtagrels.id_tag )
WHERE ( u.country = 'IE'
OR u.country IN ( 'INT' ) )
AND CASE
WHEN ( ( t.id_tag = lptagrels.id_tag )
AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( ( t.id_tag = wtagrels.id_tag )
AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
AND wc.wc_type = 0
AND
wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
AND o.order_status = 'paid'
AND CASE
WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
ELSE 1
END
ELSE 1
END
AND ( t.tag LIKE "%Dictatorship%"
OR t.tag LIKE "%Democracy%" )
GROUP BY td.id_tutor
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
ORDER BY tutor_popularity DESC,
u.surname ASC,
u.name ASC
LIMIT 0, 20
问题
上述查询返回的结果是正确的(并且逻辑按预期工作),但查询所花费的时间会随着重量的增加而惊人地增加数据,对于我拥有的当前数据,与 0.005 - 0.0002 秒量级的正常查询时间相比,它大约需要 25 秒,这使得它完全无法使用。
由于尚未对所有可能的字段建立索引,因此可能会导致一些延迟。标签表的标签字段是有索引的。 查询有问题吗? 执行时间超过 20 秒的原因是什么?
标签在该系统中如何工作?
- 当导师注册时,会输入标签并根据导师的详细信息(如姓名等)创建标签关系。
- 当导师创建包时,会再次输入标签并标记关系是根据包的详细信息(例如包名称、描述等)创建的。
- 存储在tutorials_tag_relations中的导师的标签关系以及存储在learning_packs_tag_relations中的包的标签关系。所有单独的标签都存储在标签表中。
解释查询输出:- 请查看此屏幕截图 - http://www.test.examvillage.com/Explain_query.jpg
About the system:
- There are tutors who create classes and packs
- A tags based search approach is being followed.Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searcheable). For details please check the section How tags work in this system? below.
Following is the concerned query
SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
SUM(DISTINCT( t.tag LIKE "%democracy%" )) AS key_2_total_matches,
COUNT(DISTINCT( od.id_od )) AS tutor_popularity,
CASE
WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
AND wc.wc_type = 0
AND wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'classes_published',
CASE
WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'packs_published',
td . *,
u . *
FROM tutor_details AS td
JOIN users AS u
ON u.id_user = td.id_user
LEFT JOIN learning_packs_tag_relations AS lptagrels
ON td.id_tutor = lptagrels.id_tutor
LEFT JOIN learning_packs AS lp
ON lptagrels.id_lp = lp.id_lp
LEFT JOIN learning_packs_categories AS lpc
ON lpc.id_lp_cat = lp.id_lp_cat
LEFT JOIN learning_packs_categories AS lpcp
ON lpcp.id_lp_cat = lpc.id_parent
LEFT JOIN learning_pack_content AS lpct
ON ( lp.id_lp = lpct.id_lp )
LEFT JOIN webclasses_tag_relations AS wtagrels
ON td.id_tutor = wtagrels.id_tutor
LEFT JOIN webclasses AS wc
ON wtagrels.id_wc = wc.id_wc
LEFT JOIN learning_packs_categories AS wcc
ON wcc.id_lp_cat = wc.id_wp_cat
LEFT JOIN learning_packs_categories AS wccp
ON wccp.id_lp_cat = wcc.id_parent
LEFT JOIN order_details AS od
ON td.id_tutor = od.id_author
LEFT JOIN orders AS o
ON od.id_order = o.id_order
LEFT JOIN tutors_tag_relations AS ttagrels
ON td.id_tutor = ttagrels.id_tutor
JOIN tags AS t
ON ( t.id_tag = ttagrels.id_tag )
OR ( t.id_tag = lptagrels.id_tag )
OR ( t.id_tag = wtagrels.id_tag )
WHERE ( u.country = 'IE'
OR u.country IN ( 'INT' ) )
AND CASE
WHEN ( ( t.id_tag = lptagrels.id_tag )
AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( ( t.id_tag = wtagrels.id_tag )
AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
AND wc.wc_type = 0
AND
wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
AND o.order_status = 'paid'
AND CASE
WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
ELSE 1
END
ELSE 1
END
AND ( t.tag LIKE "%Dictatorship%"
OR t.tag LIKE "%Democracy%" )
GROUP BY td.id_tutor
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
ORDER BY tutor_popularity DESC,
u.surname ASC,
u.name ASC
LIMIT 0, 20
The problem
The results returned by the above query are correct (AND logic working as per expectation), but the time taken by the query rises alarmingly for heavier data and for the current data I have it is like 25 seconds as against normal query timings of the order of 0.005 - 0.0002 seconds, which makes it totally unusable.
It is possible that some of the delay is being caused because all the possible fields have not yet been indexed. The tag field of tags table is indexed.
Is there something faulty with the query?
What can be the reason behind 20+ seconds of execution time?
How tags work in this system?
- When a tutor registers, tags are entered and tag relations are created with respect to tutor's details like name, surname etc.
- When a Tutors create packs, again tags are entered and tag relations are created with respect to pack's details like pack name, description etc.
- tag relations for tutors stored in tutors_tag_relations and those for packs stored in learning_packs_tag_relations. All individual tags are stored in tags table.
The explain query output:-
Please see this screenshot - http://www.test.examvillage.com/Explain_query.jpg
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可能会看到它是否有助于在以下字段上添加索引:
然后尝试从 where 子句中删除
case-when
结构。您可以直接在连接中添加一些特定于表的限制,例如:
left join t2 on t1.id = t2.id AND ...
以使代码更具可读性。编辑:
似乎您在这里采用了错误的方法:您搜索所有标签,然后计算与搜索匹配的标签。相反,您应该搜索匹配的标签,然后对具有这些标签的结果进行计数。
You may see if it helps adding indexes on following fields:
and then try to get rid of the
case-when
structures from the where clause.You can add some table specific restrictions directly to join like:
left join t2 on t1.id = t2.id AND ...
to make code more readable.EDIT:
Seems you have a wrong approach here: you search for all tags and then count the tags that match search. Instead you should search for tags that match and then count results that have these tags.
我最近也遇到了类似的问题。我必须修改查询来实现新功能,这意味着添加多个联接和左联接。逻辑已正确实现,但对于一些更大的表来说需要很长时间。
正如布莱恩所建议的,解决方案是完全重写。
我的新方法是这样的:
所有这些都是在存储过程中完成的,最终结果已经通过了单元测试,并且速度非常快。
更新
请测试此查询,看看它是否返回与原始结果相同的结果。
如果确实如此,那么我将进一步尝试寻找更好的实现。
I've recently had a similar problem. I had to modify a query to implement a new feature, and that meant adding several joins and left joins. The logic was correctly implemented, but it took forever with some bigger tables.
The solution was a complete rewrite, as Brian suggests.
My new approach was something like this:
All this was done in a stored procedure, the end result has passed unit tests, and is blazing fast.
UPDATE
Please test this query, to see if it returns the same results as the original.
If it does, then I will further try to find a better implementation.
把这个代码从后面拿出来拍一下。
然后重新开始。
我并不是轻率,但这太可怕了,如果你现在就摆脱它,对你自己和将来接触它的任何人来说都是一个很大的帮助。
Take this code out the back and shoot it.
Then start again.
I'm not being flippant but this is horrific and you would do yourself and anyone else touching it in the future a big favour by getting rid of it right now.