NULL and IN() give unexpected results
This seems so basic that I'm flabbergasted, for lack of a better word. I have two tables; let's call them albums and artists:
CREATE TABLE `albums` (
  `album_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `artist_id` bigint(20) DEFAULT NULL,
  `name` varchar(200) NOT NULL,
  PRIMARY KEY (`album_id`)
);
CREATE TABLE `artists` (
  `artist_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(250) NOT NULL,
  PRIMARY KEY (`artist_id`)
);
There are a few hundred thousand records in each table. Some of the album rows have a NULL artist_id; this is expected.
However, when I perform the following query to find artists without albums:
SELECT * FROM artists WHERE artist_id NOT IN (SELECT artist_id FROM albums)
... the query returns zero results. I know that this is not true. So I tried this one:
SELECT * FROM artists WHERE artist_id NOT IN (SELECT artist_id FROM albums WHERE artist_id IS NOT NULL)
... and I get back a couple thousand rows. My question is: why did the first query seem to operate on the idea that any number = NULL? Or is this an odd effect that NULL has on the IN() statement? I feel like this is something basic that I've missed. I don't usually use NULL in my db tables at all.
Comments (3)
This is why NOT EXISTS is the semantically correct choice.
Logic, for some value a:
a NOT IN (x, y, NULL)
is actually NOT (a = x OR a = y OR a = NULL)
which is actually (a <> x) AND (a <> y) AND (a <> NULL)
The comparison with NULL is never TRUE, so a single NULL in the list invalidates the whole NOT IN.
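To make the difference concrete, here is a minimal sketch that runs both forms side by side. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for MySQL, with the tables trimmed to the relevant columns and a few made-up rows; the table aliases ar and al are only for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL here; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE albums  (album_id INTEGER PRIMARY KEY, artist_id INTEGER, name TEXT NOT NULL);
    INSERT INTO artists VALUES (1, 'Has Album'), (2, 'No Album');
    INSERT INTO albums  VALUES (10, 1, 'Debut'), (11, NULL, 'Compilation');
""")

# The NULL artist_id in albums keeps NOT IN from ever evaluating to TRUE.
not_in_rows = conn.execute(
    "SELECT artist_id FROM artists "
    "WHERE artist_id NOT IN (SELECT artist_id FROM albums)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL row is harmless.
not_exists_rows = conn.execute(
    "SELECT ar.artist_id FROM artists ar "
    "WHERE NOT EXISTS (SELECT 1 FROM albums al WHERE al.artist_id = ar.artist_id)"
).fetchall()

print(not_in_rows)      # no rows
print(not_exists_rows)  # only the artist without albums
```

The NOT EXISTS statement itself uses nothing SQLite-specific, so it should carry over to the original MySQL tables unchanged.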
Quick answer: the IN statement is a shortcut for =a OR =b OR ... . If you include NULLs in this list, then I think that breaks the statement. Your second option is probably the better one. Using a join might also work, and be more efficient.
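The join alternative mentioned here could look roughly like the sketch below: a LEFT JOIN that keeps only the artists for which no album row matched. As an assumption, it runs against an in-memory SQLite database with made-up rows rather than the real MySQL tables, and it makes no claim about which form is faster; that would need to be checked with EXPLAIN on the actual data.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL here; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE albums  (album_id INTEGER PRIMARY KEY, artist_id INTEGER, name TEXT NOT NULL);
    INSERT INTO artists VALUES (1, 'Has Album'), (2, 'No Album');
    INSERT INTO albums  VALUES (10, 1, 'Debut'), (11, NULL, 'Compilation');
""")

# Artists with no matching album get NULLs for every albums column after the
# LEFT JOIN; album_id is NOT NULL in the table, so a NULL there means "no match".
left_join_rows = conn.execute(
    "SELECT ar.artist_id, ar.name FROM artists ar "
    "LEFT JOIN albums al ON al.artist_id = ar.artist_id "
    "WHERE al.album_id IS NULL "
    "ORDER BY ar.artist_id"
).fetchall()

print(left_join_rows)
```

Albums with a NULL artist_id never satisfy the join condition, so they cannot hide an artist the way they do with NOT IN.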
It has to do with the way SQL NULLs are interpreted: you have to think of them as an UNKNOWN value.
Let's say you have artist_id = 1.
If you compare that value with NULL, then rather than getting 'False' you get 'UNKNOWN'.
When you run a query such as yours, only rows evaluating to 'TRUE' are returned.
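The three-valued logic described here can be checked directly by selecting the comparisons themselves. This is a small sketch that assumes SQLite (through Python's sqlite3 module) in place of MySQL; SQLite reports UNKNOWN as NULL, which Python receives as None, and TRUE as 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each column is one comparison from the discussion, using 1 as the artist_id.
row = conn.execute(
    "SELECT 1 = NULL, "             # comparison with NULL: UNKNOWN
    "       1 = 2 OR 1 = NULL, "    # FALSE OR UNKNOWN: still UNKNOWN
    "       1 NOT IN (2, NULL), "   # the NULL in the list: UNKNOWN
    "       1 NOT IN (2, 3)"        # no NULL in the list: TRUE
).fetchone()

print(row)
```

Only the last expression is TRUE, which matches the question: a WHERE clause keeps a row only when its condition is TRUE, so the NOT IN with a NULL in the list filters out every row.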