为什么在涉及唯一约束的情况下 Postgres 处理 NULL 的方式不一致？

发布于 2024-11-04 06:56:50 字数 1697 浏览 5 评论 0原文

我最近注意到 Postgres 在具有唯一约束的列中处理 NULL 的方式不一致。

考虑一个人员表：

create table People (
   pid  int  not null,
   name text not null,
   SSN  text unique,
 primary key (pid)
);

SSN 列应保持唯一。我们可以检查：

-- Add a row.
insert into People(pid, name, SSN)
values(0, 'Bob', '123');

-- Test the unique constraint.
insert into People(pid, name, SSN)
values(1, 'Carol', '123');

第二次插入失败，因为它违反了 SSN 的唯一约束。到目前为止，一切都很好。但让我们尝试一下 NULL：

insert into People(pid, name, SSN)
values(1, 'Carol', null);

这有效。

select *    
from People;

0;"Bob";"123"
1;"Carol";"<NULL>"

唯一的列将采用空值。有趣的。 Postgres 如何断言 null 在任何方面都是唯一的，或者在这方面不是唯一的？

我想知道是否可以在唯一列中添加两行为空的行。

insert into People(pid, name, SSN)
values(2, 'Ted', null);

select *    
from People;

0;"Bob";"123"
1;"Carol";"<NULL>"
2;"Ted";"<NULL>"

是的，我可以。现在，尽管 SSN 应该是唯一的，但 SSN 列中有两行为 NULL。

Postgres 文档说，为了唯一约束，空值不被视为相等。

好的。我明白这一点。这是空处理中的一个很好的微妙之处：通过考虑唯一约束列中的所有 NULL 是不相交的，我们延迟了唯一约束的强制执行，直到有一个实际的非空值作为强制执行的基础。

太酷了。但这就是 Postgres 让我迷失的地方。如果唯一约束列中的所有 NULL 都不相等，如文档所述，那么我们应该在 select unique 查询中看到所有空值。

select distinct SSN
from People;

"<NULL>"
"123"

没有。那里只有一个空值。 Postgres 似乎有这个错误。但我想知道：还有其他解释吗？

编辑：

Postgres 文档确实指定“在此比较中空值被视为相等”。在有关 SELECT DISTINCT 的部分。虽然我不理解这个概念，但我很高兴文档中对此进行了详细说明。

原文

I recently noticed an inconsistency in how Postgres handles NULLs in columns with a unique constraint.

Consider a table of people:

create table People (
   pid  int  not null,
   name text not null,
   SSN  text unique,
 primary key (pid)
);

The SSN column should be kept unique. We can check that:

-- Add a row.
insert into People(pid, name, SSN)
values(0, 'Bob', '123');

-- Test the unique constraint.
insert into People(pid, name, SSN)
values(1, 'Carol', '123');

The second insert fails because it violates the unique constraint on SSN. So far, so good. But let's try a NULL:

insert into People(pid, name, SSN)
values(1, 'Carol', null);

That works.

select *    
from People;

0;"Bob";"123"
1;"Carol";"<NULL>"

A unique column will take a null. Interesting. How can Postgres assert that null is in any way unique, or not unique for that matter?

I wonder if I can add two rows with null in a unique column.

insert into People(pid, name, SSN)
values(2, 'Ted', null);

select *    
from People;

0;"Bob";"123"
1;"Carol";"<NULL>"
2;"Ted";"<NULL>"

Yes I can. Now there are two rows with NULL in the SSN column even though SSN is supposed to be unique.

The Postgres documentation says, For the purpose of a unique constraint, null values are not considered equal.

Okay. I can see the point of this. It's a nice subtlety in null-handling: By considering all NULLs in a unique-constrained column to be disjoint, we delay the unique constraint enforcement until there is an actual non-null value on which to base that enforcement.

That's pretty cool. But here's where Postgres loses me. If all NULLs in a unique-constrained column are not equal, as the documentation says, then we should see all of the nulls in a select distinct query.

select distinct SSN
from People;

"<NULL>"
"123"

Nope. There's only a single null there. It seems like Postgres has this wrong. But I wonder: Is there another explanation?

Edit:

The Postgres docs do specify that "Null values are considered equal in this comparison." in the section on SELECT DISTINCT. While I do not understand that notion, I'm glad it's spelled out in the docs.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

深海少女心 2024-11-11 06:56:50

处理 null 时，这样说几乎总是错误的：

“空值在这里表现得像某某，*所以它们应该表现得像
这里如此这般”

这里< /a> 是一篇从 postgres 角度讨论该主题的优秀文章，简要概括为根据上下文对空值进行不同的处理，并且不要错误地对它们做出任何假设。

回复收藏 0 原文

空袭的梦i 2024-11-11 06:56:50

底线是，PostgreSQL 对 null 执行的操作是因为 SQL 标准是这么说的。

空值显然很棘手，并且可以用多种方式解释（未知值、缺失值等），因此在最初编写 SQL 标准时，作者必须在某些地方进行一些调用。我想说时间已经证明它们或多或少是正确的，但这并不意味着不可能有另一种数据库语言可以稍微（或完全）不同地处理未知和不存在的值。但 PostgreSQL 实现了 SQL，所以就是这样。

正如在另一个答案中已经提到的那样，杰夫·戴维斯（Jeff Davis）写了一些关于处理空值的好文章和演示文稿。

回复收藏 0 原文

三生路 2024-11-11 06:56:50

NULL 被认为是唯一的，因为 NULL 并不表示值不存在。列中的 NULL 是未知值。当您比较两个未知数时，您不知道它们是否相等，因为您不知道它们是什么。

想象一下，你有两个盒子，分别标记为 A 和 B。如果你不打开盒子，也看不到里面的情况，你永远不知道里面装的是什么。如果你被问到“这两个盒子里的东西一样吗？”你只能回答“我不知道”。

在这种情况下，PostgreSQL 也会做同样的事情。当被要求比较两个 NULL 时，它会说“我不知道”。这与 SQL 数据库中 NULL 的疯狂语义有很大关系。另一个答案中链接的文章是理解 NULL 行为的一个很好的起点。请注意：它因供应商而异。

回复收藏 0 原文

惜醉颜 2024-11-11 06:56:50

唯一索引中存在多个 NULL 值是可以的，因为 x = NULL 对于所有 x 都是 false，特别是当 x 本身为 NULL 时。您还会在 WHERE 子句中遇到这种行为，您必须说 WHERE x IS NULL 和 WHERE x IS NOT NULL 而不是 WHERE x = NULL< /code> 和 WHERE x <>空。