从一列而不是另一列中选择所有值的有效方法

发布于 2024-12-25 03:04:01 字数 249 浏览 1 评论 0原文

我需要从 colA 返回不在 mytable 的 colB 中的所有值。我正在使用：

SELECT DISTINCT(colA) FROM mytable WHERE colA NOT IN (SELECT colB FROM mytable)

它正在工作，但是查询需要很长时间才能完成。

有没有更有效的方法来做到这一点？

原文

I need to return all values from colA that are not in colB from mytable. I am using:

SELECT DISTINCT(colA) FROM mytable WHERE colA NOT IN (SELECT colB FROM mytable)

It is working however the query is taking an excessively long time to complete.

Is there a more efficient way to do this?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

梦中的蝴蝶 2025-01-01 03:04:01

在标准 SQL 中，DISTINCT colA 中没有括号。 DISTINCT 不是一个函数。

SELECT DISTINCT colA
FROM   mytable
WHERE  colA NOT IN (SELECT DISTINCT colB FROM mytable);

还将 DISTINCT 添加到子选择中。对于许多重复项，它可以加快查询速度。但是，通常有比 NOT IN 更有效的查询技术：

选择其他表中不存在的行

速度的决定因素是索引。您需要在 colA 和 colB 上建立索引才能加快此查询的速度。

CTE 可能会更快，具体取决于您的 DBMS。我还演示了 LEFT JOIN 作为排除 valB 中的值的替代方法，以及使用 GROUP BY 获取不同值的替代方法：

WITH x AS (SELECT colB FROM mytable GROUP BY colB)
SELECT m.colA
FROM   mytable m
LEFT   JOIN x ON x.colB = m.colA
WHERE  x.colB IS NULL
GROUP  BY m.colA;

或者，简化进一步，并使用普通子查询（可能是最快的）：

SELECT DISTINCT m.colA
FROM   mytable m
LEFT   JOIN mytable x ON x.colB = m.colA
WHERE  x.colB IS NULL;

In standard SQL there are no parentheses in DISTINCT colA. DISTINCT is not a function.

SELECT DISTINCT colA
FROM   mytable
WHERE  colA NOT IN (SELECT DISTINCT colB FROM mytable);

Added DISTINCT to the sub-select as well. For many duplicates it can speed up the query. But, typically, there are more efficient query techniques than NOT IN for this:

Select rows which are not present in other table

The deciding factor for speed will be indexes. You need indexes on colA and colB for this query to be fast.

A CTE might be faster, depending on your DBMS. I additionally demonstrate LEFT JOIN as alternative to exclude the values in valB, and an alternative way to get distinct values with GROUP BY:

WITH x AS (SELECT colB FROM mytable GROUP BY colB)
SELECT m.colA
FROM   mytable m
LEFT   JOIN x ON x.colB = m.colA
WHERE  x.colB IS NULL
GROUP  BY m.colA;

Or, simplified further, and with a plain subquery (probably fastest):

SELECT DISTINCT m.colA
FROM   mytable m
LEFT   JOIN mytable x ON x.colB = m.colA
WHERE  x.colB IS NULL;

回复收藏 0 原文

鱼忆七猫命九 2025-01-01 03:04:01

您可以使用 exists：

select distinct
    colA
from
    mytable m1
where
    not exists (select 1 from mytable m2 where m2.colB = m1.colA)

exists 执行半连接来快速匹配值。 not in 完成整个结果集，然后对其执行 or 操作。对于表中的值，exists 通常更快。

You can use exists:

select distinct
    colA
from
    mytable m1
where
    not exists (select 1 from mytable m2 where m2.colB = m1.colA)

exists does a semi-join to quickly match the values. not in completes the entire result set and then does an or on it. exists is typically faster for values in tables.

回复收藏 0 原文