什么查询最快?
我有三种方法来构造查询:
第一种:
select obj from table1 where condition1 and obj in (
select obj from table2 where condition2 and obj in (
select obj from table3 where condition3 and obj in (
...
)))
第二种:
select obj from table1 where condition1
and obj in (select obj from table2 where condition2)
and obj in (select obj from table3 where condition3)
...
第三种:
select table1.obj from table1
inner join table2 on table2.obj = table1.obj and table2.condition='condition2'
inner join table3 on table3.obj = table2.obj and table3.condition='condition3'
...
where table1.condition='condition1'
我的问题是这些查询是否提供相同的结果以及这些查询是否同样最优。
我非常确定前两个查询会产生相同的输出,但第二个查询速度更快。我不确定第三个查询。
添加
还有另一种选择:
select table1.obj from table1
inner join table2 on table2.obj = table1.obj
inner join table3 on table3.obj = table2.obj
...
where
table1.condition='condition1' and
table2.condition='condition2' and
table3.condition='condition3'
I have three ways to construct my query:
The first one:
select obj from table1 where condition1 and obj in (
select obj from table2 where condition2 and obj in (
select obj from table3 where condition3 and obj in (
...
)))
The second one:
select obj from table1 where condition1
and obj in (select obj from table2 where condition2)
and obj in (select obj from table3 where condition3)
...
The third one:
select table1.obj from table1
inner join table2 on table2.obj = table1.obj and table2.condition='condition2'
inner join table3 on table3.obj = table2.obj and table3.condition='condition3'
...
where table1.condition='condition1'
My questions is if these queries provide the same result and if these query are equally optimal.
I am pretty sure that the first two queries produce the same output but the second query is faster. Am not sure about the third query.
ADDED
There is also another option:
select table1.obj from table1
inner join table2 on table2.obj = table1.obj
inner join table3 on table3.obj = table2.obj
...
where
table1.condition='condition1' and
table2.condition='condition2' and
table3.condition='condition3'
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
尽管总有例外,但选项 3 几乎肯定是最佳/首选。根据您的索引和数据分布,MySQL 查询执行计划器将处理从表中提取的顺序。
在其他情况下,子查询(选项 1 和 2)针对外部查询的每一行执行 - 它们的效率可能非常低。因此,按照前面的说法,嵌套子查询(选项 1)可能比使用一阶子查询(选项 2)或普通联接(选项 3)差很多。
请注意,对于
INNER JOIN
来说,如果额外条件位于JOIN
子句或WHERE
子句。因此,您的其他选项实际上等同于选项 3。Although there are always exceptions, Option 3 is almost surely the best/first choice. Depending on your indexes and data distributions, the MySQL query execution planner will handle which order to pull from the tables.
In the other cases, subqueries (Options 1 and 2) are executed for every row of the outer query -- they can be terribly inefficient. So, following the previous statement, nested subqueries (Option 1), can be exponentially worse than using first order subqueries (Option 2) or normal joins (Option 3).
Note that for
INNER JOIN
s, it doesn't matter with respect to performance or functionality if the extra conditions are in theJOIN
clauses or in theWHERE
clauses. Consequently, your other option is effectively equivalent to the Option 3.基本上执行查询的方式如下:FROM(获取全部数据)=> WHERE(应用约束)=>SELECT(显示结果)
JOIN 子句始终是连接数据的最佳选择,因为在 WHERE 子句中,仅测试与 JOIN 子句匹配的数据。
在 FROM 中,在获取 WHERE 和 SELECT 的全部数据之前,仅选择并测试 ON 子句中的字段。
在您的第一个示例中,对于每个 SELECT,都会选择表的全部内容来测试 WHERE 子句。
在 WHERE 子句中加入几乎是同样的问题。
最后两个例子似乎是相同的。我更喜欢最后一个,因为第三个示例中使用 ON 子句对于 RIGHT 或 LEFT JOIN 很有用,但在这种情况下,它只会给子句带来混乱:FROM:获取准确的资源,WHERE:应用约束。
这个解释非常示意性,但我希望它有意义......
Basicaly the way of execute a query is the following : FROM (get the whole data) => WHERE (apply constraints) =>SELECT (display results)
JOIN clause are always the best choice for joining data because in the WHERE clause, only data that matches the JOIN clauses are tested.
In the FROM, only field on the ON clause are selected and tested before getting the whole data for the WHERE and the SELECT.
In your both first examples, for each SELECT the whole content of tables is selected for the tests of the WHERE clause.
It is almost the same problem for joining in the WHERE clause.
The both last examples seems to be the same. I would prefer the last one because the use of the ON clause in the third example is useful for RIGHT or LEFT JOIN but in this case, it just brings mess in the clauses'use : FROM : get the accurate resource, WHERE : apply constraints.
This explanation is very schematic but I hope it makes sense ...
我怀疑第三个查询会是最快的。 SQL 经过优化,使 JOIN 成为快速操作。
但了解您的数据的唯一方法就是尝试并查看。
I would suspect the third query would be fastest. SQL is optimized to make JOINs a fast operation.
But the only way to know with YOUR data is to try them and see.
好吧,首先,这些查询不一定返回相同的结果:
IN() 删除重复项。因此,如果 table2 包含 50 个满足条件 y=foo 的 x 值,则查询 2 将返回比查询 1 多 50 倍的行。这可能是您想要的,也可能不是。如果两个表中的 x 都是 UNIQUE,则查询将给出相同的结果。
这当然是错误的,因为子查询不是依赖子查询。无论如何,它将执行一次子查询,并对它进行散列或排序以消除重复项(根据 IN() 的要求),然后使用结果列表来执行 IN 连接。 MySQL 直到最近的版本都对每一行重新执行子查询,但现在不再是这样了。由于 MySQL 不执行哈希 IN 连接,因此它可能仍然比 JOIN 慢得多。
Well, first, those queries don't necessarily return the same results :
The IN() removes duplicates. So, if table2 contains 50 values of x satisfying condition y=foo, query 2 will return 50x more rows than query 1. That may be what you want, or not. If x is UNIQUE in both tables, the queries will give the same results.
This is of course wrong, since the subqueries are not dependent subqueries. Anyway, it will execute the subquery once, and hash it or sort it to eliminate duplicates (as required by IN()) then use the resulting list to perform an IN-join. MySQL until recent versions did re-execute the subquery for each row, this is no longer the case. Since MySQL doesn't do hash IN joins though, it might still be much slower than a JOIN.