如何比较sql中的元组组

发布于 2024-08-13 12:57:40 字数 606 浏览 6 评论 0原文

如何比较 sql 中的元组组:考虑以下示例:

TABLE T1
--------
GROUP     VALUE
-----     -----
A         FOO
A         BAR
X         HHH
X         ZOO

TABLE T2
--------
GROUP     VALUE
-----     -----
B         ZOO
C         FOO
C         BAR

我想编写一个 sql 查询来比较两个表中的值组并报告差异。在所示示例中,表 a 中的组:((A,FOO),(A,BAR)) 与组 ((C,FOO),(C,BAR)) 相同,即使组名称不同。重要的是该组的内容是相同的。最后,查询将报告存在差异:它是 (B,ZOO) 元组。

RESULT
------
GROUP     VALUE
-----     -----
B         ZOO
X         HHH
X         ZOO

尽管 T1 中包含 ZOO 的组 X 在 T2 中具有匹配值:(B,ZOO),但它仍然不是匹配项,因为该组还具有不属于 (B, ZOO) 组的 (X, HHH) 值在T2

How to compare groups of tuples in sql: consider the following example:

TABLE T1
--------
GROUP     VALUE
-----     -----
A         FOO
A         BAR
X         HHH
X         ZOO

TABLE T2
--------
GROUP     VALUE
-----     -----
B         ZOO
C         FOO
C         BAR

I want to write an sql query which compares the groups of values in both tables and reports the differences. In the illustrated example, the group in table a: ((A,FOO),(A,BAR)) is the same as the group ((C,FOO),(C,BAR)) even though the group name is different. What counts is that the contents of the group are the same. Finally the query would report that there is a difference: it is the (B,ZOO) tuple.

RESULT
------
GROUP     VALUE
-----     -----
B         ZOO
X         HHH
X         ZOO

Although the group X containing ZOO in T1 has a matching value in T2: (B,ZOO) it is still not a match because the group also has the (X, HHH) value which is not part of the (B, ZOO) group in T2

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

淡水深流 2024-08-20 12:57:40

像这样,

create table t1 (group_id varchar2(20), value varchar2(20));
create table t2 (group_id varchar2(20), value varchar2(20));

insert into t1 values ('A','FOO');
insert into t1 values ('A','BAR');
insert into t1 values ('X','HHH');
insert into t1 values ('X','ZOO');
insert into t2 values ('C','FOO');
insert into t2 values ('C','BAR');
insert into t2 values ('B','ZOO');


select t1.group_id t1_group,t2.group_id t2_group, 
      --t1.all_val, t2.all_val, 
       case when t1.all_val = t2.all_val then 'match' else 'no match' end coll_match
from 
  (select 'T1' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t1 group by group_id) t1
full outer join
  (select 'T2' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t2 group by group_id) t2
on t1.min_val = t2.min_val and t1.max_val = t2.max_val and t1.cnt_val = t2.cnt_val
/

我根据每组中的最小值、最大值和不同值的数量进行了初步消除,这将有助于处理大型数据集。如果数据集足够小,您可能不需要它们。

这告诉你比赛。您只需额外执行一步即可找到没有任何匹配项的组

select t1_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t1_group
having min(coll_match) is null
/

select t2_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t2_group
having min(coll_match) is null
/

Something like this

create table t1 (group_id varchar2(20), value varchar2(20));
create table t2 (group_id varchar2(20), value varchar2(20));

insert into t1 values ('A','FOO');
insert into t1 values ('A','BAR');
insert into t1 values ('X','HHH');
insert into t1 values ('X','ZOO');
insert into t2 values ('C','FOO');
insert into t2 values ('C','BAR');
insert into t2 values ('B','ZOO');


select t1.group_id t1_group,t2.group_id t2_group, 
      --t1.all_val, t2.all_val, 
       case when t1.all_val = t2.all_val then 'match' else 'no match' end coll_match
from 
  (select 'T1' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t1 group by group_id) t1
full outer join
  (select 'T2' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t2 group by group_id) t2
on t1.min_val = t2.min_val and t1.max_val = t2.max_val and t1.cnt_val = t2.cnt_val
/

I've done a preliminary elimination based on the minmium, maximum and number of distinct values in each group, which would help with large datasets. If the datasets are small enough, you may not need them.

That tells you the matches. You just need to push it out an extra step to find the groups that don't have any matches

select t1_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t1_group
having min(coll_match) is null
/

select t2_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t2_group
having min(coll_match) is null
/
晨与橙与城 2024-08-20 12:57:40

T1 和 T2(两个表)之间的差异可能是这样的:

SELECT
   T1.GROUPNAME,
   T1.VALUE
FROM 
   T1
LEFT JOIN T2
ON T2.Value = T1.Value
WHERE T2.GROUPNAME IS NULL

例如 T1 有:

Foo 100
酒吧 200
ZZZ 333

和 T2 包括:
富100
Bar 200

该查询的结果是 ZZZ 333,它是两个表中唯一不匹配的记录。您甚至可以将 T2 的组名称更改为:

XYZ 100
ZXZ 200

结果仍然是 ZZZ 333。这是根据您所要求的,如果您想要相反的结果,您可以对其进行 UNION,或使用 RIGHT join。

乔恩

Difference between T1 and T2 (two tables) could be this:

SELECT
   T1.GROUPNAME,
   T1.VALUE
FROM 
   T1
LEFT JOIN T2
ON T2.Value = T1.Value
WHERE T2.GROUPNAME IS NULL

For instance T1 has:

Foo 100
Bar 200
ZZZ 333

And T2 includes:
Foo 100
Bar 200

The result of that query is ZZZ 333 it is the only record that does not match in both tables. You could even change T2's group name to say:

XYZ 100
ZXZ 200

And the result is still ZZZ 333. This is per what you are asking, if you want the opposite you can either UNION to it, or use RIGHT join.

Jon

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文