NOT IN 子句中的 NULL 值

发布于 2024-07-07 04:34:48 字数 426 浏览 7 评论 0 原文

当我得到不同的记录计数时,我认为相同的查询(一个使用 not in where 约束,另一个使用 left join。 not in 约束中的表有一个空值(错误数据),这导致该查询返回 0 条记录。 我有点明白为什么,但我需要一些帮助才能完全掌握这个概念。

简单来说,为什么查询A有结果而B没有返回结果?

A: select 'true' where 3 in (1, 2, 3, null)
B: select 'true' where 3 not in (1, 2, null)

这是在 SQL Server 2005 上进行的。我还发现调用 set ansi_nulls off 会导致 B 返回结果。

This issue came up when I got different records counts for what I thought were identical queries one using a not in where constraint and the other a left join. The table in the not in constraint had one null value (bad data) which caused that query to return a count of 0 records. I sort of understand why but I could use some help fully grasping the concept.

To state it simply, why does query A return a result but B doesn't?

A: select 'true' where 3 in (1, 2, 3, null)
B: select 'true' where 3 not in (1, 2, null)

This was on SQL Server 2005. I also found that calling set ansi_nulls off causes B to return a result.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(12

深空失忆 2024-07-14 04:34:48

查询 A 与以下内容相同:

select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null

由于 3 = 3 为 true,因此您会得到结果。

查询 B 与以下内容相同:

select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null

ansi_nulls 打开时,3 <> null 是 UNKNOWN,因此谓词的计算结果为 UNKNOWN,并且您不会获得任何行。

ansi_nulls 关闭时,3 <> null 为 true,因此谓词的计算结果为 true,并且您会得到一行。

Query A is the same as:

select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null

Since 3 = 3 is true, you get a result.

Query B is the same as:

select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null

When ansi_nulls is on, 3 <> null is UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.

When ansi_nulls is off, 3 <> null is true, so the predicate evaluates to true, and you get a row.

高跟鞋的旋律 2024-07-14 04:34:48

与未知值比较时,NOT IN 返回 0 条记录

由于 NULL 是未知值,因此包含 NULLNOT IN 查询可能值列表中的 code> 或 NULL 将始终返回 0 记录,因为无法确保 NULL 值是不是正在测试的值。

NOT IN returns 0 records when compared against an unknown value

Since NULL is an unknown, a NOT IN query containing a NULL or NULLs in the list of possible values will always return 0 records since there is no way to be sure that the NULL value is not the value being tested.

傲娇萝莉攻 2024-07-14 04:34:48

每当您使用 NULL 时,您实际上是在处理三值逻辑。

您的第一个查询返回的结果是 WHERE 子句的计算结果:

    3 = 1 or 3 = 2 or 3 = 3 or 3 = null
which is:
    FALSE or FALSE or TRUE or UNKNOWN
which evaluates to 
    TRUE

第二个查询:

    3 <> 1 and 3 <> 2 and 3 <> null
which evaluates to:
    TRUE and TRUE and UNKNOWN
which evaluates to:
    UNKNOWN

UNKNOWN 与 FALSE 不同
您可以通过调用轻松测试它:

select 'true' where 3 <> null
select 'true' where not (3 <> null)

两个查询都不会给您任何结果

如果 UNKNOWN 与 FALSE 相同,则假设第一个查询将给您 FALSE,第二个查询必须计算为 TRUE,因为它与 NOT 相同(错误)。
事实并非如此。

SqlServerCentral 上有一篇关于此主题的非常好的文章

NULL 和三值逻辑的整个问题一开始可能会有点令人困惑,但为了在 TSQL 中编写正确的查询,必须理解它。

我推荐的另一篇文章是 SQL 聚合函数和 NULL

Whenever you use NULL you are really dealing with a Three-Valued logic.

Your first query returns results as the WHERE clause evaluates to:

    3 = 1 or 3 = 2 or 3 = 3 or 3 = null
which is:
    FALSE or FALSE or TRUE or UNKNOWN
which evaluates to 
    TRUE

The second one:

    3 <> 1 and 3 <> 2 and 3 <> null
which evaluates to:
    TRUE and TRUE and UNKNOWN
which evaluates to:
    UNKNOWN

The UNKNOWN is not the same as FALSE
you can easily test it by calling:

select 'true' where 3 <> null
select 'true' where not (3 <> null)

Both queries will give you no results

If the UNKNOWN was the same as FALSE then assuming that the first query would give you FALSE the second would have to evaluate to TRUE as it would have been the same as NOT(FALSE).
That is not the case.

There is a very good article on this subject on SqlServerCentral.

The whole issue of NULLs and Three-Valued Logic can be a bit confusing at first but it is essential to understand in order to write correct queries in TSQL

Another article I would recommend is SQL Aggregate Functions and NULL.

梦罢 2024-07-14 04:34:48

与 null 的比较是未定义的,除非您使用 IS NULL。

因此,当将 3 与 NULL 进行比较时(查询 A),它返回 undefined。

即 SELECT 'true' where 3 in (1,2,null)

SELECT 'true' where 3 not in (1,2,null)

将产生相同的结果,因为 NOT (UNDEFINED) 仍然未定义,但不是 TRUE

Compare to null is undefined, unless you use IS NULL.

So, when comparing 3 to NULL (query A), it returns undefined.

I.e. SELECT 'true' where 3 in (1,2,null)
and
SELECT 'true' where 3 not in (1,2,null)

will produce the same result, as NOT (UNDEFINED) is still undefined, but not TRUE

残疾 2024-07-14 04:34:48

SQL 使用三值逻辑来表示真值。 IN 查询产生预期结果:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE col IN (NULL, 1)
-- returns first row

但是添加 NOT 不会反转结果:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE NOT col IN (NULL, 1)
-- returns zero rows

这是因为上面的查询等效于以下内容:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE NOT (col = NULL OR col = 1)

以下是 where 子句的计算方式:

| col | col = NULL⁽¹⁾  | col = 1 | col = NULL OR col = 1 | NOT (col = NULL OR col = 1) |
|-----|----------------|---------|-----------------------|-----------------------------|
| 1   | UNKNOWN        | TRUE    | TRUE                  | FALSE                       |
| 2   | UNKNOWN        | FALSE   | UNKNOWN⁽²⁾            | UNKNOWN⁽³⁾                  |

注意:

  1. 涉及 NULL 的比较产生 UNKNOWN
  2. OR 表达式其中没有一个操作数为 TRUE 并且至少有一个操作数为 UNKNOWN 则产生 UNKNOWN (ref)
  3. UNKNOWNNOT 产生 未知参考

您可以将上面的示例扩展到两个以上的值(例如 NULL、1 和 2),但结果将是相同的:如果其中一个值是 NULL,则没有行匹配。

SQL uses three-valued logic for truth values. The IN query produces the expected result:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE col IN (NULL, 1)
-- returns first row

But adding a NOT does not invert the results:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE NOT col IN (NULL, 1)
-- returns zero rows

This is because the above query is equivalent of the following:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE NOT (col = NULL OR col = 1)

Here is how the where clause is evaluated:

| col | col = NULL⁽¹⁾  | col = 1 | col = NULL OR col = 1 | NOT (col = NULL OR col = 1) |
|-----|----------------|---------|-----------------------|-----------------------------|
| 1   | UNKNOWN        | TRUE    | TRUE                  | FALSE                       |
| 2   | UNKNOWN        | FALSE   | UNKNOWN⁽²⁾            | UNKNOWN⁽³⁾                  |

Notice that:

  1. The comparison involving NULL yields UNKNOWN
  2. The OR expression where none of the operands are TRUE and at least one operand is UNKNOWN yields UNKNOWN (ref)
  3. The NOT of UNKNOWN yields UNKNOWN (ref)

You can extend the above example to more than two values (e.g. NULL, 1 and 2) but the result will be same: if one of the values is NULL then no row will match.

淡写薰衣草的香 2024-07-14 04:34:48

如果您想使用 NOT IN 过滤包含 NULL 的子查询,只需检查是否为 null

SELECT blah FROM t WHERE blah NOT IN
        (SELECT someotherBlah FROM t2 WHERE someotherBlah IS NOT NULL )

IF you want to filter with NOT IN for a subquery containg NULLs justcheck for not null

SELECT blah FROM t WHERE blah NOT IN
        (SELECT someotherBlah FROM t2 WHERE someotherBlah IS NOT NULL )
德意的啸 2024-07-14 04:34:48

在撰写本文时这个问题的标题是

SQL NOT IN 约束和 NULL 值

从问题的文本来看,问题似乎发生在 SQL DML SELECT 查询中,而不是 SQL DDL CONSTRAINT 中。

然而,特别是考虑到标题的措辞,我想指出,这里所做的一些陈述可能是误导性的陈述,这些陈述类似于(释义)

当谓词计算结果为 UNKNOWN 时,您不会获得任何行。

尽管 SQL DML 就是这种情况,但在考虑约束时,效果是不同的。

考虑这个非常简单的表,其中两个约束直接取自问题中的谓词(并在 @Brannon 的出色答案中解决):

DECLARE @T TABLE 
(
 true CHAR(4) DEFAULT 'true' NOT NULL, 
 CHECK ( 3 IN (1, 2, 3, NULL )), 
 CHECK ( 3 NOT IN (1, 2, NULL ))
);

INSERT INTO @T VALUES ('true');

SELECT COUNT(*) AS tally FROM @T;

根据 @Brannon 的答案,第一个约束(使用 IN)计算为TRUE 且第二个约束(使用 NOT IN)计算结果为 UNKNOWN。 但是,插入成功了! 因此,在这种情况下,严格来说“你没有得到任何行”并不正确,因为我们确实插入了一行。

就 SQL-92 标准而言,上述效果确实是正确的。 比较和对比 SQL-92 规范中的以下部分

7.6 where 子句

结果是 T 的那些行的表
搜索条件的结果为真。

4.10 完整性约束

当且仅当指定的
对于表格的任何行,搜索条件都不为假。

换句话说:

在 SQL DML 中,当 WHERE 计算结果为 UNKNOWN 时,行将从结果中删除,因为它满足条件“为真”。

在 SQL DDL(即约束)中,当行的计算结果为 UNKNOWN 时,它们不会从结果中删除,因为它确实满足“不为假”条件。

尽管 SQL DML 和 SQL DDL 中的效果分别看起来可能是矛盾的,但通过允许 UNKNOWN 结果满足约束(更准确地说,允许它们不会无法满足约束)来给予 UNKNOWN 结果“无罪”是有实际原因的。 :如果没有这种行为,每个约束都必须显式处理空值,从语言设计的角度来看,这将非常不令人满意(更不用说,这对编码人员来说是一种痛苦!)

ps如果您发现遵循这样的逻辑具有挑战性:“正如我要写的那样,“未知不会无法满足约束”,然后考虑您可以通过避免 SQL DDL 中的可空列以及 SQL DML 中产生空值的任何内容(例如外连接)来省去所有这些!

The title of this question at the time of writing is

SQL NOT IN constraint and NULL values

From the text of the question it appears that the problem was occurring in a SQL DML SELECT query, rather than a SQL DDL CONSTRAINT.

However, especially given the wording of the title, I want to point out that some statements made here are potentially misleading statements, those along the lines of (paraphrasing)

When the predicate evaluates to UNKNOWN you don't get any rows.

Although this is the case for SQL DML, when considering constraints the effect is different.

Consider this very simple table with two constraints taken directly from the predicates in the question (and addressed in an excellent answer by @Brannon):

DECLARE @T TABLE 
(
 true CHAR(4) DEFAULT 'true' NOT NULL, 
 CHECK ( 3 IN (1, 2, 3, NULL )), 
 CHECK ( 3 NOT IN (1, 2, NULL ))
);

INSERT INTO @T VALUES ('true');

SELECT COUNT(*) AS tally FROM @T;

As per @Brannon's answer, the first constraint (using IN) evaluates to TRUE and the second constraint (using NOT IN) evaluates to UNKNOWN. However, the insert succeeds! Therefore, in this case it is not strictly correct to say, "you don't get any rows" because we have indeed got a row inserted as a result.

The above effect is indeed the correct one as regards the SQL-92 Standard. Compare and contrast the following section from the SQL-92 spec

7.6 where clause

The result of the is a table of those rows of T for
which the result of the search condition is true.

4.10 Integrity constraints

A table check constraint is satisfied if and only if the specified
search condition is not false for any row of a table.

In other words:

In SQL DML, rows are removed from the result when the WHERE evaluates to UNKNOWN because it does not satisfy the condition "is true".

In SQL DDL (i.e. constraints), rows are not removed from the result when they evaluate to UNKNOWN because it does satisfy the condition "is not false".

Although the effects in SQL DML and SQL DDL respectively may seem contradictory, there is practical reason for giving UNKNOWN results the 'benefit of the doubt' by allowing them to satisfy a constraint (more correctly, allowing them to not fail to satisfy a constraint): without this behaviour, every constraints would have to explicitly handle nulls and that would be very unsatisfactory from a language design perspective (not to mention, a right pain for coders!)

p.s. if you are finding it as challenging to follow such logic as "unknown does not fail to satisfy a constraint" as I am to write it, then consider you can dispense with all this simply by avoiding nullable columns in SQL DDL and anything in SQL DML that produces nulls (e.g. outer joins)!

梦在深巷 2024-07-14 04:34:48

在 A 中,测试 3 与集合中每个成员的相等性,产生(FALSE、FALSE、TRUE、UNKNOWN)。 由于元素之一为 TRUE,因此条件为 TRUE。 (也有可能这里发生了一些短路,所以它实际上在遇到第一个 TRUE 时就停止了,并且从不计算 3=NULL。)

在 B 中,我认为它正在将条件评估为 NOT (3 in (1 ,2,空))。 测试 3 是否与集合产生相等(FALSE、FALSE、UNKNOWN),其聚合为 UNKNOWN。 NOT ( UNKNOWN ) 产生 UNKNOWN。 因此,总体而言,情况的真实性是未知的,最终基本上被视为错误。

In A, 3 is tested for equality against each member of the set, yielding (FALSE, FALSE, TRUE, UNKNOWN). Since one of the elements is TRUE, the condition is TRUE. (It's also possible that some short-circuiting takes place here, so it actually stops as soon as it hits the first TRUE and never evaluates 3=NULL.)

In B, I think it is evaluating the condition as NOT (3 in (1,2,null)). Testing 3 for equality against the set yields (FALSE, FALSE, UNKNOWN), which is aggregated to UNKNOWN. NOT ( UNKNOWN ) yields UNKNOWN. So overall the truth of the condition is unknown, which at the end is essentially treated as FALSE.

仅冇旳回忆 2024-07-14 04:34:48

Null 表示数据不存在,也就是说它是未知的,而不是没有任何数据值。 对于有编程背景的人来说很容易混淆这一点,因为在 C 类型语言中使用指针时 null 确实没什么。

因此,在第一种情况下,3 确实在 (1,2,3,null) 集合中,因此返回 true

在第二种情况下,您可以将其减少为

select 'true' 其中 3 不在 (null) 中

因此不会返回任何内容,因为解析器对您要与之比较的集合一无所知 - 它不是一个空集合,而是一个未知集合。 使用 (1, 2, null) 没有帮助,因为 (1,2) 集显然是假的,但随后你将其与未知相比较,这是未知的。

Null signifies and absence of data, that is it is unknown, not a data value of nothing. It's very easy for people from a programming background to confuse this because in C type languages when using pointers null is indeed nothing.

Hence in the first case 3 is indeed in the set of (1,2,3,null) so true is returned

In the second however you can reduce it to

select 'true' where 3 not in (null)

So nothing is returned because the parser knows nothing about the set to which you are comparing it - it's not an empty set but an unknown set. Using (1, 2, null) doesn't help because the (1,2) set is obviously false, but then you're and'ing that against unknown, which is unknown.

你的他你的她 2024-07-14 04:34:48

从这里的答案可以得出结论,NOT IN (subquery) 不能正确处理空值,应该避免使用 NOT EXISTS。 然而,这样的结论可能还为时过早。 在下面的场景中,出自 Chris Date(数据库编程与设计,第 2 卷第 9 期,1989 年 9 月),正确处理空值并返回正确结果的是 NOT IN,而不是 不存在

考虑一个表 sp 来表示已知供应零件 (pno) 数量 (qtysno) >)。 该表当前包含以下值:

      VALUES ('S1', 'P1', NULL), 
             ('S2', 'P1', 200),
             ('S3', 'P1', 1000)

请注意,数量可为空,即能够记录已知供应商供应零件的事实,即使不知道其数量。

任务是查找已知供应部件编号“P1”但数量不超过 1000 的供应商。

以下使用 NOT IN 仅正确识别供应商“S2”:

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1', NULL ), 
                       ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT DISTINCT spx.sno
  FROM sp spx
 WHERE spx.pno = 'P1'
       AND 1000 NOT IN (
                        SELECT spy.qty
                          FROM sp spy
                         WHERE spy.sno = spx.sno
                               AND spy.pno = 'P1'
                       );

但是,以下查询使用相同的一般结构,但具有 NOT EXISTS 但错误地将供应商“S1”包含在结果中(即数量为空):

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1', NULL ), 
                       ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT DISTINCT spx.sno
  FROM sp spx
 WHERE spx.pno = 'P1'
       AND NOT EXISTS (
                       SELECT *
                         FROM sp spy
                        WHERE spy.sno = spx.sno
                              AND spy.pno = 'P1'
                              AND spy.qty = 1000
                      );

因此 NOT EXISTS 不是银弹它可能已经出现了!

当然,问题的根源是空值的存在,因此“真正”的解决方案是消除这些空值。

这可以使用两个表来实现(以及其他可能的设计):

  • sp 已知供应零件的供应商
  • spq 已知供应已知数量零件的供应商,

注意到可能应该有一个外国供应商其中 spq 引用 sp 的键约束。

然后可以使用“减”关系运算符(即标准 SQL 中的 EXCEPT 关键字)获得结果,例如

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1' ), 
                       ( 'S2', 'P1' ),
                       ( 'S3', 'P1' ) )
              AS T ( sno, pno )
     ),
     spq AS 
     ( SELECT * 
         FROM ( VALUES ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT sno
  FROM spq
 WHERE pno = 'P1'
EXCEPT 
SELECT sno
  FROM spq
 WHERE pno = 'P1'
       AND qty = 1000;

It may be concluded from answers here that NOT IN (subquery) doesn't handle nulls correctly and should be avoided in favour of NOT EXISTS. However, such a conclusion may be premature. In the following scenario, credited to Chris Date (Database Programming and Design, Vol 2 No 9, September 1989), it is NOT IN that handles nulls correctly and returns the correct result, rather than NOT EXISTS.

Consider a table sp to represent suppliers (sno) who are known to supply parts (pno) in quantity (qty). The table currently holds the following values:

      VALUES ('S1', 'P1', NULL), 
             ('S2', 'P1', 200),
             ('S3', 'P1', 1000)

Note that quantity is nullable i.e. to be able to record the fact a supplier is known to supply parts even if it is not known in what quantity.

The task is to find the suppliers who are known supply part number 'P1' but not in quantities of 1000.

The following uses NOT IN to correctly identify supplier 'S2' only:

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1', NULL ), 
                       ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT DISTINCT spx.sno
  FROM sp spx
 WHERE spx.pno = 'P1'
       AND 1000 NOT IN (
                        SELECT spy.qty
                          FROM sp spy
                         WHERE spy.sno = spx.sno
                               AND spy.pno = 'P1'
                       );

However, the below query uses the same general structure but with NOT EXISTS but incorrectly includes supplier 'S1' in the result (i.e. for which the quantity is null):

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1', NULL ), 
                       ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT DISTINCT spx.sno
  FROM sp spx
 WHERE spx.pno = 'P1'
       AND NOT EXISTS (
                       SELECT *
                         FROM sp spy
                        WHERE spy.sno = spx.sno
                              AND spy.pno = 'P1'
                              AND spy.qty = 1000
                      );

So NOT EXISTS is not the silver bullet it may have appeared!

Of course, source of the problem is the presence of nulls, therefore the 'real' solution is to eliminate those nulls.

This can be achieved (among other possible designs) using two tables:

  • sp suppliers known to supply parts
  • spq suppliers known to supply parts in known quantities

noting there should probably be a foreign key constraint where spq references sp.

The result can then be obtained using the 'minus' relational operator (being the EXCEPT keyword in Standard SQL) e.g.

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1' ), 
                       ( 'S2', 'P1' ),
                       ( 'S3', 'P1' ) )
              AS T ( sno, pno )
     ),
     spq AS 
     ( SELECT * 
         FROM ( VALUES ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT sno
  FROM spq
 WHERE pno = 'P1'
EXCEPT 
SELECT sno
  FROM spq
 WHERE pno = 'P1'
       AND qty = 1000;
梦过后 2024-07-14 04:34:48

这是给男孩的:

select party_code 
from abc as a
where party_code not in (select party_code 
                         from xyz 
                         where party_code = a.party_code);

无论 ansi 设置如何,这都有效

this is for Boy:

select party_code 
from abc as a
where party_code not in (select party_code 
                         from xyz 
                         where party_code = a.party_code);

this works regardless of ansi settings

断肠人 2024-07-14 04:34:48

这也可能有助于了解 join、exists 和 in 之间的逻辑差异
http://weblogs.sqlteam.com/mladenp/archive/ 2007/05/18/60210.aspx

also this might be of use to know the logical difference between join, exists and in
http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文