子查询中 SQL WHERE EXISTS 的有趣观察
我想知道是否有人可以解释为什么会发生以下情况:
我有我制作的下表。它有 2 列“A”和“B”:
A B
==========
1, 11
2, 12
3, 13
4, 14
5, 15
6, 16
7, 17
8, 18
9, 19
如果我运行以下查询:-
SELECT * FROM
(SELECT A,B FROM Table_1) T1
WHERE EXISTS
(SELECT 'X' FROM Table_1 WHERE A = 3)
我会得到整个表。我理解这一点,因为 EXISTS 子句检查它是否在语句中找到它所做的单行
但是,如果我运行以下查询,我只得到表的一部分部分
SELECT * FROM
(SELECT A,B FROM Table_1) T1
WHERE EXISTS
(SELECT 'X' FROM Table_1 WHERE T1.A=3 OR T1.A=4)
结果如下
A B
========
3, 13
4, 14
任何人都可以解释为什么会发生这种情况吗?
I'd like to know if someone can explain why the following occurs:
I have the following Table which I made up. It has 2 columns "A" and "B":
A B
==========
1, 11
2, 12
3, 13
4, 14
5, 15
6, 16
7, 17
8, 18
9, 19
If I run the following Query :-
SELECT * FROM
(SELECT A,B FROM Table_1) T1
WHERE EXISTS
(SELECT 'X' FROM Table_1 WHERE A = 3)
I get the entire table. This I understand cause the EXISTS clause checks if it finds a single row in a statement which it does
However, If I run the following query, I only get a portion of the table
SELECT * FROM
(SELECT A,B FROM Table_1) T1
WHERE EXISTS
(SELECT 'X' FROM Table_1 WHERE T1.A=3 OR T1.A=4)
The portional results are as follows
A B
========
3, 13
4, 14
Can anyone explain why this happens?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
让我们分解一下您实际要求数据库服务器执行的操作。
SELECT * FROM
- 获取所有字段。从现在开始我将忽略这一点,因为它并不重要。(SELECT A,B FROM Table_1) T1
- 从表中获取所有行,并将结果集命名为“T1”。WHERE EXISTS (SELECT 'X' FROM Table_1 WHERE A = 3)
- 选择与上述无关的表中的所有行。这种情况在表中每行发生一次 - 但总是做同样的事情,因为子查询中不使用T1
。如果表中的某一行 A=3(总是这种情况),则不做任何限制。否则,丢弃该特定的T1
(这里不会发生这种情况)。WHERE EXISTS (SELECT 'X' FROM Table_1 WHERE T1.A=3 OR T1.A=4)
- 这个很棘手。您正在对表中的所有行进行第二次选择,但受到以下条件的限制:T1.A=3 OR T1.A=4
。此条件基于T1
的结果 - 但这不是整个表,只是特定行。当您说 SELECT * FROM mytable WHERE mytable.A=3 时,您的意思并不是“如果某行具有 A=3,则选择 mytable 的所有行”,您的意思只是选择那些特定的行,其中确实如此。因此,子查询SELECT 'X' FROM Table_1 WHERE T1.A=3 OR T1.A=4
要么不包含任何行,要么包含Table_1
中的所有行,具体取决于哪个值位于T1.A
中。因为您使用EXISTS
,所以所有行都为 true,无行为 false。这就是为什么您会得到不同的结果 - 您的子查询在
Table_1
中每行执行一次。在第一种情况下,它始终包含一行。在第二种情况下,它要么不包含行,要么包含全部九行,具体取决于T1
中包含的Table_1
的特定行。Let's break down what you're actually asking the database server to do.
SELECT * FROM
- get all fields. I'm going to ignore this bit from now on, since it's not important.(SELECT A,B FROM Table_1) T1
- get all rows from the table, and name that result set "T1".WHERE EXISTS (SELECT 'X' FROM Table_1 WHERE A = 3)
- Select, INDEPENDENT of the above, all rows from the table. This happens once per row in the table - but always does the same thing, sinceT1
is not used in the subquery. If one of the rows of the table has A=3 (always the case), do no restriction. Otherwise, discard that particularT1
(this does not happen here).WHERE EXISTS (SELECT 'X' FROM Table_1 WHERE T1.A=3 OR T1.A=4)
- this one is tricky. You're making a second selection of all rows from the table, but you're restricting by this condition:T1.A=3 OR T1.A=4
. This condition is based on the result ofT1
- but that's not the whole table, just a particular row. When you saySELECT * FROM mytable WHERE mytable.A=3
, you don't mean "select all rows of mytable if some row has A=3", you only mean to select those particular rows where that is true. So, the subquerySELECT 'X' FROM Table_1 WHERE T1.A=3 OR T1.A=4
contains either no rows or all the rows inTable_1
, depending on which value is inT1.A
. Because you useEXISTS
, you get true for all rows and false for none.That's why you get different results - your subquery is performed once per row in
Table_1
. In the first case, it always contains one row. In the second case, it contains either no rows or all nine, depending on the particular row ofTable_1
contained inT1
.如果删除一些子查询可能会更容易。您的查询基本上可归结为以下内容:
查询 1:从
Table_1
中选择true
中的所有记录。这是因为Table_1
中有包含 3 或 4 的A
字段。查询 2:选择
Table_1
中的所有记录,其中A< /code> 等于 3 或
A
等于 4用 SQL 表达,您的查询可以简化为
和
It's probably easier if you remove some of the subqueries. Your queries essentially come down to the following:
Query 1: Select all records from
Table_1
wheretrue
. This is because there areA
fields inTable_1
that contain 3 or 4.Query 2: Select all records from
Table_1
whereA
equals 3 orA
equals 4Expressed in SQL, your queries can be simplified to
and
我相信它会检查 T1 的当前行来查看 A=3 或 A=4,类似于您写道:
I believe it would be examining the current row of T1 to see if A=3 or A=4, similar to if you wrote:
在 SQL 中总是有多种方法来编写相同的内容。例如,这
......可以重写为:
后者更简单,我认为没有理由更喜欢前者。相应地重写您的第一个查询,
我删除了相关名称
T1
因为它没有任何用途。子查询不引用其“外部”表表达式,因此不需要消除每个外观Table_1
的歧义。我想您明白这里发生了什么:如果
Table_1
中的一行或多行满足搜索条件A = 3
则返回整个表,否则返回空集回来了。虽然它是一个有效的查询,但它通常不是一个非常有用的结构。然而,对于您的第二个查询,至少需要一个相关名称,因为子查询确实引用了其外部表:
这在语义上再次与您的第二个查询等效。请注意,我已为子查询中
Table_1
的外观指定了相关名称T2
,但T2
并未出现在子查询的WHERE 中
子句。因为没有使用 T2,所以我们可以完全删除子查询(因此需要相关名称):值得指出的是,通常会利用子查询引用“外部”表表达式的能力作为“相关子查询”,即搜索条件(
WHERE
子句)涉及“内部”和“外部”表(您的第二个查询的搜索条件涉及“外部”表 仅有的)。使用常用零件和供应商数据库,下面是一个相关子查询的示例,用于实现<一个href="https://stackoverflow.com/questions/6449544/findout-duplicate-rows-in-a-table-while-inserting/6450663#6450663">半加入寻找供应商(
S< /code>) 提供 (
SP
) 至少一部分:请注意,子查询的搜索条件将“外部”表
SP
与“内部”表S
。此外,“外部”表中的投影SELECT SNO, SNAME
不需要包含相关名称S
,因为来自 ' “外部”表不在“内部”表的范围内。There are always multiple ways to write the same thing in SQL. For example this
...may be rewritten as this:
The latter is simpler and I see no reason to prefer the former. Rewriting your first query accordingly
I've removed the correlation name
T1
because it server no purpose. The subquery does not reference its 'outer' table expression so each appearanceTable_1
does not need to be disambiguated.I think you understand what is going on here: if one or more rows in
Table_1
satisfy the search conditionA = 3
then the entire table is returned, otherwise the empty set is returned. While it is a valid query it is not often a very useful construct.For your second query, however, at least one correlation name is required because the subquery does reference its outer table:
Again this is semantically equivalent to your second query. Note that I've given the apperance of
Table_1
in the subquery the correlation nameT2
butT2
does not appear in the subquery'sWHERE
clause. BecauseT2
is not being used, we can remove the subquery (hence the need for correlation names) entirely:It's worth pointing out that the ability for a subquery to reference an 'outer' table expression is usually exploited as a 'correlated subquery' i.e. the search condition (
WHERE
clause) involves both 'inner' and 'outer' tables (your second query's search condition involves the 'outer' table only).Using the usual parts and suppliers database, here is an example of a correlated subquery to implement a semijoin to find suppliers (
S
) who supply (SP
) at least one part:Note the subquery's search condition relates the 'outer' table
SP
to the 'inner' tableS
. Also, the projectionSELECT SNO, SNAME
in the 'outer' table does not require the inclusion of the correlation nameS
becauseSP
from the 'outer' table is not in scope for the 'inner' table.哇,酷。
那里有一些额外的 SQL。你真正在做的是:
那么发生了什么?
当您将 Table_1 的每一行读取为 T1 时,引擎会评估子选择,以决定 T1 中的行是否应包含在结果集中。
在第一行,T1.A 为 1,则 T1.A = 3 为 FALSE,T1.A = 4 为 FALSE,因此子选择失败。该行不包含在结果集中。
A = 2 时相同。不包括 A = 2 的行。
但对于接下来的两行(3 和 4 的 T1.A),
OR
的至少一侧计算结果为TRUE
并且子选择成功。所以这两行都包含在内。当 T1.A 达到 5 时,对于表的其余部分,子选择将失败。
Wow, cool.
You have a little extra SQL in there. What you're really doing is:
So what's happening?
As you read each row of Table_1 as T1, the engine evaluates the sub-select in order to decide whether the row from T1 should be included in the result set.
On the first row, where T1.A is 1, then T1.A = 3 is
FALSE
and T1.A = 4 isFALSE
, so the sub-select fails. That row is not included in the result set.Same when A = 2. The row with A = 2 is not included.
But for the next two rows (T1.A of 3 and 4) at least one side of the
OR
evaluates toTRUE
and the sub-select succeeds. So those two rows are included.When you hit T1.A of 5, and for the rest of the table, the sub-select fails.