SQL:连接中的 where 与 on
也许是一个愚蠢的问题,但考虑一下这两个表:
T1
Store Year
01 2009
02 2009
03 2009
01 2010
02 2010
03 2010
T2
Store
02
为什么这个 INNER JOIN 给出了我想要的结果(过滤 ON 子句中的 [year]):
select t1.*
from t1
inner join t2
on t1.store = t2.store
and t1.[year] = '2009'
Store Year
02 2009
以及为什么 LEFT OUTER JOIN 包括 2010 年的记录?
select t1.*
from t1
left outer join t2
on t1.store = t2.store
and t1.year = '2009'
where t2.store is null
01 2009
03 2009
01 2010
02 2010
03 2010
我必须在“WHERE”子句中编写[年份]过滤器:
select t1.*
from t1
left outer join t2
on t1.store = t2.store
where t2.store is null
and t1.year = '2009'
01 2009
03 2009
就像我说的,也许是一个愚蠢的问题,但它困扰着我!
Perhaps a dumb question, but consider these 2 tables :
T1
Store Year
01 2009
02 2009
03 2009
01 2010
02 2010
03 2010
T2
Store
02
Why is this INNER JOIN giving me the results I want (filtering the [year] in the ON clause) :
select t1.*
from t1
inner join t2
on t1.store = t2.store
and t1.[year] = '2009'
Store Year
02 2009
And why the LEFT OUTER JOIN include records of year 2010 ?
select t1.*
from t1
left outer join t2
on t1.store = t2.store
and t1.year = '2009'
where t2.store is null
01 2009
03 2009
01 2010
02 2010
03 2010
And I have to write the [year] filter in the 'WHERE' clause :
select t1.*
from t1
left outer join t2
on t1.store = t2.store
where t2.store is null
and t1.year = '2009'
01 2009
03 2009
Like I said, perhaps a dumb question, but it's bugging me !
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果按照 LEFT JOIN 的定义,
现在意识到,exist 相当于 ON 条件的计算结果为 true,这一切都有意义。
对于 t1.year 为 2010 的行,on 表达式计算结果为 false(对于 t1.year = 2010 的所有行,x AND 2010=2009 为 false),但由于它是左联接,因此仍将返回左表中的行(根据定义)。
因此这种条件只能写成where条件而不能写成join条件。
(一般情况下,它不需要是表,而是选择表达式)
编辑:
Erwin 有趣地指出,where 条件可以用
because:
转为 JOIN ,因此只有上表中 join 列为 false 的行才会在 t2.* 字段中返回空值。
我的意思是这种条件不能变成纯连接,因为左连接的工作方式(即使连接条件为假仍然返回记录),就像内连接一样条件(可以变成纯连接,反之亦然)。
关于提议的查询 - 我的建议是不要使用它。
作为一般规则,使用 OR 的查询的性能比使用 AND 条件的查询差(OR 扩展结果集;AND 限制)。这既适用于连接条件,也适用于 where 条件。
您的查询会比在 where 条件中具有 t1.year = '2009' 的查询执行得更差,因为在后面的情况下,可以使用索引(如果存在),因为如果您像您一样加入,那么您基本上是人为地将一个表中的记录与另一个表中的记录连接起来,以便您的 where 条件仅过滤您需要的记录。
只从 t1 中获取以 2009 年开头的记录应该会更有效(假设有一个关于年份的索引,并且选择性足够高,这将在 WHERE 条件下发生)。
这两个有关性能的建议都可以而且应该通过检查查询计划来验证。
最后,该查询有点晦涩 - 在 t1.year = '2009' 的情况下,连接变成笛卡尔积(稍后被过滤掉)并不是很明显。因此,如果假设简单的 LEFT JOIN 与
where t1.year = 2009 AND t2.store is null
执行效果更好且更具可读性,我就不会使用此查询。If you go by the definition of the LEFT JOIN that
Now realize that the exist is equivalent to ON condition evaluates to true and it all makes sense.
For row where t1.year is 2010 the on expression evaluates to false (x AND 2010=2009 is false for all rows where t1.year = 2010), but since it is a left join the row from the left table will still be returned (according to the definition).
Therefore this kind of condition can only be written as where condition and not join condition.
(in general case it need not be table but a select expression)
EDIT:
As Erwin interestingly pointed out, the where condition can be turned into JOIN with
because:
so only the rows in which join column in the above table is false will return nulls in t2.* fields.
What I meant to say is that this kind of condition can not be turned into a pure join because the way left join works (still returns records even when join condition is false), as is possible with inner join conditions (can be turned into pure joins and vice versa).
Regarding the proposed query - my advice would be not to use it.
Queries with ORs perform worse then queries with AND conditions as a general rule (OR expand the result set; ANDs restrict). This applies both to join conditions and to where conditions.
Your query would perform worse then query that has t1.year = '2009' in the where condition, because in later case it would be possible to use an index (if one exists), because if you join like you do, you are basically artificially joining records from one table with records from another just so that your where condition filter only the records you need.
It should be more effective to only get the records from t1 that have 2009 to begin with (assuming there's an index on year and selectivity is high enough that will happen with the WHERE condition).
Both of these suggestions regarding performance can and should be verified by inspecting the query plan.
Finally, the query is a bit obscure - it is not immediately obvious that in case of t1.year = '2009' the join becomes Cartesian product (which is filtered out later). So, if the assumption that simple
LEFT JOIN
withwhere t1.year = 2009 AND t2.store is null
perform better AND is more readable, I would not use this query.您的第二个查询返回第一个表中的所有结果,除了连接中匹配的结果(其结果可以在第一个查询中看到)。如果您去掉“其中 t2.store 是null”子句,并将“t2.Store”添加到所选列。您应该看到的结果是:
正如您所看到的,如果您随后过滤到 t2.store 为 null 的行,那么您所做的就是减去第二行(唯一与 join 子句匹配的行。)
Your second query returns all the results from the first table, except the ones matched in the join (whose results can be seen in the first query.) This might be easier for you to visualize if you take away the "where t2.store is null" clause, and add "t2.Store" to the selected columns. The results you should see then are:
As you can see, if you then filter to just the rows where t2.store is null, all you're doing is subtracting the second row (the only one that matched the join clause.)