SQL:连接中的 where 与 on

发布于 2024-08-28 15:31:01 字数 794 浏览 6 评论 0原文

也许是一个愚蠢的问题,但考虑一下这两个表:

T1                       
Store    Year
01  2009
02  2009
03  2009
01  2010
02  2010
03  2010

T2
Store
02

为什么这个 INNER JOIN 给出了我想要的结果(过滤 ON 子句中的 [year]):

select t1.*
from t1
inner join t2
on t1.store = t2.store
and t1.[year] = '2009'

Store    Year
02  2009

以及为什么 LEFT OUTER JOIN 包括 2010 年的记录?

select t1.*
from t1
left outer join t2
on t1.store = t2.store
and t1.year = '2009'
where t2.store is null

01  2009
03  2009
01  2010
02  2010
03  2010


我必须在“WHERE”子句中编写[年份]过滤器:

select t1.*
from t1
left outer join t2
on t1.store = t2.store
where t2.store is null
and t1.year = '2009'

01  2009
03  2009

就像我说的,也许是一个愚蠢的问题,但它困扰着我!

Perhaps a dumb question, but consider these 2 tables :

T1                       
Store    Year
01  2009
02  2009
03  2009
01  2010
02  2010
03  2010

T2
Store
02

Why is this INNER JOIN giving me the results I want (filtering the [year] in the ON clause) :

select t1.*
from t1
inner join t2
on t1.store = t2.store
and t1.[year] = '2009'

Store    Year
02  2009

And why the LEFT OUTER JOIN include records of year 2010 ?

select t1.*
from t1
left outer join t2
on t1.store = t2.store
and t1.year = '2009'
where t2.store is null

01  2009
03  2009
01  2010
02  2010
03  2010

And I have to write the [year] filter in the 'WHERE' clause :

select t1.*
from t1
left outer join t2
on t1.store = t2.store
where t2.store is null
and t1.year = '2009'

01  2009
03  2009

Like I said, perhaps a dumb question, but it's bugging me !

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

空心空情空意 2024-09-04 15:31:01

如果按照 LEFT JOIN 的定义,

  • 对于左侧表中的每一行,如果存在右侧表中的每一行,它将返回右侧表中的匹配行;如果
  • 右侧不存在行,它仍然会从左侧返回一行右侧表中的所有列都设置为 NULL

现在意识到,exist 相当于 ON 条件的计算结果为 true,这一切都有意义。

对于 t1.year 为 2010 的行,on 表达式计算结果为 false(对于 t1.year = 2010 的所有行,x AND 2010=2009 为 false),但由于它是左联接,因此仍将返回左表中的行(根据定义)。

因此这种条件只能写成where条件而不能写成join条件。

(一般情况下,它不需要是表,而是选择表达式)

编辑:
Erwin 有趣地指出,where 条件可以用

select t1.*
from t1
left outer join t2
on t1.store = t2.store
or t1.year = '2009'
where t2.store is null

because:

t1.store t1.year t2.store t1.store=t2.store t1.year=2009  join(OR)
01       2009    02       false             true          true
02       2009    02       true              true          true
03       2009    02       false             true          true
01       2010    02       false             false         false
02       2010    02       true              false         true
03       2010    02       false             false         false

转为 JOIN ,因此只有上表中 join 列为 false 的行才会在 t2.* 字段中返回空值。

我的意思是这种条件不能变成纯连接,因为左连接的工作方式(即使连接条件为假仍然返回记录),就像内连接一样条件(可以变成纯连接,反之亦然)。

关于提议的查询 - 我的建议是不要使用它。

作为一般规则,使用 OR 的查询的性能比使用 AND 条件的查询差(OR 扩展结果集;AND 限制)。这既适用于连接条件,也适用于 where 条件。

您的查询会比在 where 条件中具有 t1.year = '2009' 的查询执行得更差,因为在后面的情况下,可以使用索引(如果存在),因为如果您像您一样加入,那么您基本上是人为地将一个表中的记录与另一个表中的记录连接起来,以便您的 where 条件仅过滤您需要的记录。
只从 t1 中获取以 2009 年开头的记录应该会更有效(假设有一个关于年份的索引,并且选择性足够高,这将在 WHERE 条件下发生)。

这两个有关性能的建议都可以而且应该通过检查查询计划来验证。

最后,该查询有点晦涩 - 在 t1.year = '2009' 的情况下,连接变成笛卡尔积(稍后被过滤掉)并不是很明显。因此,如果假设简单的 LEFT JOIN 与 where t1.year = 2009 AND t2.store is null 执行效果更好且更具可读性,我就不会使用此查询。

If you go by the definition of the LEFT JOIN that

  • for each row from the left side table it will return a matching row from the right side table if one exists
  • if no rows exist on the right side it will still return a row from a left side table with all columns from the right side table set to NULL

Now realize that the exist is equivalent to ON condition evaluates to true and it all makes sense.

For row where t1.year is 2010 the on expression evaluates to false (x AND 2010=2009 is false for all rows where t1.year = 2010), but since it is a left join the row from the left table will still be returned (according to the definition).

Therefore this kind of condition can only be written as where condition and not join condition.

(in general case it need not be table but a select expression)

EDIT:
As Erwin interestingly pointed out, the where condition can be turned into JOIN with

select t1.*
from t1
left outer join t2
on t1.store = t2.store
or t1.year = '2009'
where t2.store is null

because:

t1.store t1.year t2.store t1.store=t2.store t1.year=2009  join(OR)
01       2009    02       false             true          true
02       2009    02       true              true          true
03       2009    02       false             true          true
01       2010    02       false             false         false
02       2010    02       true              false         true
03       2010    02       false             false         false

so only the rows in which join column in the above table is false will return nulls in t2.* fields.

What I meant to say is that this kind of condition can not be turned into a pure join because the way left join works (still returns records even when join condition is false), as is possible with inner join conditions (can be turned into pure joins and vice versa).

Regarding the proposed query - my advice would be not to use it.

Queries with ORs perform worse then queries with AND conditions as a general rule (OR expand the result set; ANDs restrict). This applies both to join conditions and to where conditions.

Your query would perform worse then query that has t1.year = '2009' in the where condition, because in later case it would be possible to use an index (if one exists), because if you join like you do, you are basically artificially joining records from one table with records from another just so that your where condition filter only the records you need.
It should be more effective to only get the records from t1 that have 2009 to begin with (assuming there's an index on year and selectivity is high enough that will happen with the WHERE condition).

Both of these suggestions regarding performance can and should be verified by inspecting the query plan.

Finally, the query is a bit obscure - it is not immediately obvious that in case of t1.year = '2009' the join becomes Cartesian product (which is filtered out later). So, if the assumption that simple LEFT JOIN with where t1.year = 2009 AND t2.store is null perform better AND is more readable, I would not use this query.

但可醉心 2024-09-04 15:31:01

您的第二个查询返回第一个表中的所有结果,除了连接中匹配的结果(其结果可以在第一个查询中看到)。如果您去掉“其中 t2.store 是null”子句,并将“t2.Store”添加到所选列。您应该看到的结果是:

01 2009 null
02 2009 02
03 2009 null
01 2010 null
02 2010 null
03 2010 null

正如您所看到的,如果您随后过滤到 t2.store 为 null 的行,那么您所做的就是减去第二行(唯一与 join 子句匹配的行。)

Your second query returns all the results from the first table, except the ones matched in the join (whose results can be seen in the first query.) This might be easier for you to visualize if you take away the "where t2.store is null" clause, and add "t2.Store" to the selected columns. The results you should see then are:

01 2009 null
02 2009 02
03 2009 null
01 2010 null
02 2010 null
03 2010 null

As you can see, if you then filter to just the rows where t2.store is null, all you're doing is subtracting the second row (the only one that matched the join clause.)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文