使用 COUNT 进行子查询的 MYSQL 查询速度较慢
是的,我不知道为什么,但是这个查询需要 6 秒多的时间来执行,索引都设置正确,如果我单独运行每个查询,它的执行时间不到 0.5 秒,效果很好。
这是查询
SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
(SELECT COUNT(*)
FROM supplier_questions q1
WHERE c.supplier_id = q1.supplier_id AND q1.incomplete = '0') AS questions,
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,
(SELECT COUNT(*)
FROM supplier_questions q2
WHERE c.supplier_id = q2.supplier_id AND q2.reviewed = '1') AS reviewed,
questapproved,
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20
解释查询的结果如下
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ss ref site_id,supplier_id site_id 4 const 1287 Using where; Using temporary; Using filesort
1 PRIMARY c eq_ref PRIMARY PRIMARY 4 ss.supplier_id 1
3 DEPENDENT SUBQUERY q2 ref supplier_id,reviewed reviewed 4 const 263 Using where
2 DEPENDENT SUBQUERY q1 ref supplier_id,incomplete incomplete 4 const 254 Using where
计数查询在那里的原因是因为我需要知道这些表中的行数,这不能在另一个查询中完成,因为结果也需要按这些值排序:(
Right I have no idea why but this query takes well over 6 seconds to execute, index's are all setup correctly and if I run each query separately it works great with less than 0.5 seconds to execute.
Here is the query
SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
(SELECT COUNT(*)
FROM supplier_questions q1
WHERE c.supplier_id = q1.supplier_id AND q1.incomplete = '0') AS questions,
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,
(SELECT COUNT(*)
FROM supplier_questions q2
WHERE c.supplier_id = q2.supplier_id AND q2.reviewed = '1') AS reviewed,
questapproved,
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20
Results of the Explain query is as follows
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ss ref site_id,supplier_id site_id 4 const 1287 Using where; Using temporary; Using filesort
1 PRIMARY c eq_ref PRIMARY PRIMARY 4 ss.supplier_id 1
3 DEPENDENT SUBQUERY q2 ref supplier_id,reviewed reviewed 4 const 263 Using where
2 DEPENDENT SUBQUERY q1 ref supplier_id,incomplete incomplete 4 const 254 Using where
The reason the count queries are in there is because I need to know the number of rows from those tables, this can't be done in another query as the results also need to be sorted by those values :(
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
作为黑暗中的刺杀,这运行得更快吗? (我没有 mysql 来验证语法,所以请原谅任何轻微的错误,但你可能会明白)
As a stab in the dark, does this run faster? (I havent got a mysql to verify the syntax on, so forgive any slight mistakes, but you might get the idea)
由于自动生成的主键永远不会等于 0(除非数据库设计错误),您可以删除 c.supplier_id != '0' 子句。
ss.site_id = '2' 应处于 JOIN 条件以提高可读性。
看起来这应该只匹配每个供应商表中的一行(如果这是您通常的 1-N 事物地址关系,即您选择每个供应商的第二个地址,也许“2”对应于“帐单地址”)或其他)所以 GROUP BY c.supplier_id 是无用的。如果 GROUP BY 实际上做了一些事情,那么查询就是错误的,因为“地址”列(可能来自 seller_site 表)将来自随机行。
所以这是简化的 FROM (WHERE 消失了):
我想您在 c.supplier_name 上有一个索引,所以这部分查询应该非常快。
现在尝试这个查询:
Since autogenerated primary keys are never equal to 0 (unless big db design mistake) you can drop the c.supplier_id != '0' clause.
ss.site_id = '2' should be in the JOIN condition for readability.
It looks like this should match only one row in table supplier_site per supplier (if this is your usual 1-N thing-addresses relation, ie you're selecting the second address of each supplier, maybe '2' corresponds to 'billing address' or something) so the GROUP BY c.supplier_id is useless. If the GROUP BY actually does something, then the query is wrong, since the "address" columns, which presumably come from supplier_site table, would come from a random row.
So here's the simplified FROM (the WHERE is gone) :
I suppose you got an index on c.supplier_name so this part of the query should be very fast.
Now try this query :
如果删除子选择,您最终会得到如下结果:
我没有可用的 MySQL 数据库,因此我的 SQL 中可能存在错误。
这个想法是删除子查询并用外连接替换它们
并使用 IF 只计算相关行。
If you remove the sub-selects you end up with something like this:
I don't have a MySQL database available so there may be errors in my SQL.
The idea is to remove the subqueries and replace them with an outer join
and use IF to only count relevant rows.
我将首先尝试通过供应商预先查询问题数量并审核一次来进行重组。然后,加入其余的细节。通过使用 STRAIGHT_JOIN 关键字,它应该按照显示的顺序进行处理。这将首先进行预聚合,并使用它作为基础来连接回供应商,然后连接到供应商站点。不需要外部组,因为它无论如何都基于供应商 ID。然而,加入供应商站点(您的 ss.supplier_no)将意味着供应商拥有多个地点。这是否意味着地址和活动状态列源自该表?
问题的连接是否应该与特定供应商及其相应的站点位置相关联?
此外,由于预查询在 seller_id != '0' 上有 WHERE 子句,因此下游不需要它,因为这将是与其他表正常联接的基础,从而将它们从结果集中消除。
I would first attempt restructure by pre-querying the aggregates by supplier the count of questions and reviewed ONCE. Then, join to the rest of the details. By using the STRAIGHT_JOIN keyword, it should process in the order displayed. This will pre-aggregate first and use THAT as the basis to join back to suppliers and then supplier sites. No outer group by needed since its based on a supplier ID anyhow. However, the join to supplier_sites (your ss.supplier_no) would imply a supplier has more than one location. Does that mean the address and active status columns are originating from that table?
Should the join of questions be associated with a specific supplier and it's corresponding site location or not?
Additionally, since the prequery has the WHERE clause on supplier_id != '0', it's not needed down stream since that will be the basis of a normal join to the other tables, thus eliminating them out of the result set.