SQL 选择其中一列的值在另一个条件列中相同的行

发布于 2024-08-04 00:19:02 字数 587 浏览 6 评论 0原文

我有一个如下所示的交叉引用表:

id  document_id  subject_id
1   8            21
2   5            17
3   5            76
4   7            88
5   9            17
6   9            76
7   2            76

它将文档与主题相匹配。文档可以是多个主题的成员。我想从该表中返回给定文档与给定集中的所有主题匹配的行。例如,给定主题集:

(17,76)

我只想返回与交叉引用表中某处(至少)该集中的所有主题相匹配的文档的行。给定上述集合的所需输出集将是:

id  document_id  subject_id
2   5            17
3   5            76
5   9            17
6   9            76

请注意,不会返回表的最后一行,因为该文档仅与所需主题之一匹配。

在 SQL 中查询此内容最简单、最有效的方法是什么?

I have a cross reference table that looks like this:

id  document_id  subject_id
1   8            21
2   5            17
3   5            76
4   7            88
5   9            17
6   9            76
7   2            76

It matches documents to subjects. Documents can be members of more than one subject. I want to return rows from this table where a given document matches all the subjects in a given set. For example, given the set of subjects:

(17,76)

I want to return only rows for documents which match all the subjects in that set (at least) somewhere in the cross reference table. The desired output set given the above set would be:

id  document_id  subject_id
2   5            17
3   5            76
5   9            17
6   9            76

Notice that the last row of the table is not returned because that document only matches one of the required subjects.

What is the simplest and most efficient way to query for this in SQL?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

守不住的情 2024-08-11 00:19:02

我假设该表的自然键是 document_id + subject_id,并且该 id 是代理项; IOW、document_id 和 subject_id 是唯一的。因此,我只是假装它不存在,并且自然键有一个唯一的约束。

让我们从显而易见的事情开始。

SELECT document_id, subject_id
  FROM document_subjects
 WHERE subject_id IN (17,76)

这会给你带来你想要的一切加上你不想要的东西。所以我们需要做的就是过滤掉其他的东西。 “其他内容”是具有不等于所需主题的计数的计数的行组。

SELECT document_id
  FROM document_subjects
 WHERE subject_id IN (17,76)
 GROUP BY document_id
HAVING COUNT(*) = 2

请注意,subject_id 已被删除,因为它不参与分组。更进一步,我将添加一个名为 subject_i_want 的虚构表,其中包含您想要的 N 行主题。

SELECT document_id
  FROM document_subjects
 WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
 GROUP BY document_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want)

显然,subject_i_want 可以换成另一个子查询、临时表或其他任何东西。但是,一旦有了这个 document_id 列表,您就可以在更大查询的子选择中使用它。

SELECT document_id, subject_id, ...
  FROM document_subjects
 WHERE document_id IN(
        SELECT document_id
          FROM document_subjects
          WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
          GROUP BY document_id
         HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want))

或者无论如何。

I assume that the natrual key of this table is document_id + subject_id, and that id is a surrogate; IOW, document_id and subject_id are unique. As such, I'm just going to pretend it doesn't exist and that a unique constraint is on the natural key.

Let's start with the obvious.

SELECT document_id, subject_id
  FROM document_subjects
 WHERE subject_id IN (17,76)

That gets you everything you want plus stuff you don't want. So all we need to do is filter out the other stuff. The "other stuff" is groups of rows having a count that is not equal to the count of the desired subjects.

SELECT document_id
  FROM document_subjects
 WHERE subject_id IN (17,76)
 GROUP BY document_id
HAVING COUNT(*) = 2

Note that subject_id is removed because it doesn't participate in grouping. Taking this one step further, i'm going to add an imaginary table called subjects_i_want that contains N rows of subjects you want.

SELECT document_id
  FROM document_subjects
 WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
 GROUP BY document_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want)

Obviously subjects_i_want could be swapped out for another subquery, temporary table, or whatever. But, once you have this list of document_id, you can use it within a subselect of a bigger query.

SELECT document_id, subject_id, ...
  FROM document_subjects
 WHERE document_id IN(
        SELECT document_id
          FROM document_subjects
          WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
          GROUP BY document_id
         HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want))

Or whatever.

星軌x 2024-08-11 00:19:02

使用 Oracle(或任何允许使用 with 子句的数据库)。这允许仅定义一次 subject_id 值。

with t as (select distinct document_id from table1 where subject_id in (17,76) )
select document_id from table1 where subject_id in (select subject_id from t)
group by document_id 
having count(*) = (select count (*) from t);

Using Oracle (or any database that allows the with clause). This allows definition of the subject_id values exactly once.

with t as (select distinct document_id from table1 where subject_id in (17,76) )
select document_id from table1 where subject_id in (select subject_id from t)
group by document_id 
having count(*) = (select count (*) from t);
王权女流氓 2024-08-11 00:19:02

这是一个非常有趣的问题。

我假设您想要一个更通用的查询,但这就是在您始终具有相同数量的主题(例如两个)的情况下我会做的:

 SELECT T.id, T.document_id, T.subject_id
   FROM table T
        INNER JOIN table T1 ON T.document_id = T1.document_id AND T1.subject_ID = 17
        INNER JOIN table T2 ON T.document_id = T2.document_id AND T2.subject_ID = 76            

当然,您可以添加另一个 INNER JOIN 来添加另一个主题ID..但我承认这不是一个很好的通用解决方案。

That's a very interesting question.

I'm assuming you would like a more generalized query, but this is what I would do in the case where you always have the same number of subjects (say two):

 SELECT T.id, T.document_id, T.subject_id
   FROM table T
        INNER JOIN table T1 ON T.document_id = T1.document_id AND T1.subject_ID = 17
        INNER JOIN table T2 ON T.document_id = T2.document_id AND T2.subject_ID = 76            

Of course, you could add yet another INNER JOIN to add another subject ID.. But I admit it's not a very good general solution.

夜深人未静 2024-08-11 00:19:02
select document_id from table1
 where subject_id in (17, 76)
 group by document_id
having count(distinct subject_id) = 2
select document_id from table1
 where subject_id in (17, 76)
 group by document_id
having count(distinct subject_id) = 2
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文