MYSQL:自连接时避免重复记录的笛卡尔积

发布于 2024-09-16 06:43:18 字数 1131 浏览 8 评论 0原文

有两个表:表 A 和表 B。它们具有相同的列,并且数据几乎相同。它们都有自动递增的 ID,两者之间唯一的区别是它们对相同的记录有不同的 ID。

在这些列中,有一个IDENTIFIER列不是唯一的,即两个表中都有(很少)具有相同IDENTIFIER的记录。

现在,为了找到表 A 的 ID 和表 B 的 ID 之间的对应关系,我必须在 IDENTIFIER 列上连接这两个表(出于所有目的,它都是自连接),类似于:

SELECT A.ID, B.ID
FROM A INNER JOIN B ON A.IDENTIFIER = B.IDENTIFIER

但是,由于 IDENTIFIER 不唯一,这会生成 IDENTIFIER 重复值的所有可能组合,我不希望这样。

理想情况下,我想生成一个具有重复 IDENTIFIER 值的 ID 之间的一种关联(基于其顺序)。例如,假设表 A(以及表 B)中有 6 个具有不同 ID 和相同 IDENTIFIER 值的记录:

A                                 B
IDENTIFIER:'ident105', ID:10  ->  IDENTIFIER:'ident105', ID:3
IDENTIFIER:'ident105', ID:20  ->  IDENTIFIER:'ident105', ID:400
IDENTIFIER:'ident105', ID:23  ->  IDENTIFIER:'ident105', ID:420
IDENTIFIER:'ident105', ID:100 ->  IDENTIFIER:'ident105', ID:512
IDENTIFIER:'ident105', ID:120 ->  IDENTIFIER:'ident105', ID:513
IDENTIFIER:'ident105', ID:300 ->  IDENTIFIER:'ident105', ID:798

这将是理想的情况。 无论如何,无论 ID 的顺序如何生成一对一关联的方法仍然可以(但不是首选)。

感谢您抽出时间,

西尔维奥

There are two tables: table A and table B. They have the same columns and the data is practically identical. They both have auto-incremented IDs, the only difference between the two is that they have different IDs for the same records.

Among the columns, there is an IDENTIFIER column which is not unique, i.e. there are (very few) records with the same IDENTIFIER in both tables.

Now, in order to find a correspondence between the IDs of table A and the IDs of table B, I have to join these two tables (for all purposes it's a self-join) on the IDENTIFIER column, something like:

SELECT A.ID, B.ID
FROM A INNER JOIN B ON A.IDENTIFIER = B.IDENTIFIER

But, being IDENTIFIER non-unique, this generates every possible combination of the repeating values of IDENTIFIER, I don't want that.

Ideally, I would like to generate an one to one association between IDs that have repeating IDENTIFIER values, based on their order. For example, supposing that there are six records with different ID and the same IDENTIFIER value in table A (and thus in table B):

A                                 B
IDENTIFIER:'ident105', ID:10  ->  IDENTIFIER:'ident105', ID:3
IDENTIFIER:'ident105', ID:20  ->  IDENTIFIER:'ident105', ID:400
IDENTIFIER:'ident105', ID:23  ->  IDENTIFIER:'ident105', ID:420
IDENTIFIER:'ident105', ID:100 ->  IDENTIFIER:'ident105', ID:512
IDENTIFIER:'ident105', ID:120 ->  IDENTIFIER:'ident105', ID:513
IDENTIFIER:'ident105', ID:300 ->  IDENTIFIER:'ident105', ID:798

That would be ideal.
Anyway, a way to generate a one to one association regardless of the order of the IDs would still be ok (but not preferred).

Thanks for your time,

Silvio

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

半窗疏影 2024-09-23 06:43:18
select a_numbered.id, a_numbered.identifier, b_numbered.id from 
(
select a.*,
       case 
          when @identifier = a.identifier then @rownum := @rownum + 1
          else @rownum := 1
       end as rn,
       @identifier := a.identifier
  from a
  join (select @rownum := 0, @identifier := null) r
order by a.identifier

) a_numbered join (
select b.*,
       case 
          when @identifier = b.identifier then @rownum := @rownum + 1
          else @rownum := 1
       end as rn,
       @identifier := b.identifier
  from b
  join (select @rownum := 0, @identifier := null) r
order by b.identifier

) b_numbered 
on a_numbered.rn=b_numbered.rn and a_numbered.identifier=b_numbered.identifier
select a_numbered.id, a_numbered.identifier, b_numbered.id from 
(
select a.*,
       case 
          when @identifier = a.identifier then @rownum := @rownum + 1
          else @rownum := 1
       end as rn,
       @identifier := a.identifier
  from a
  join (select @rownum := 0, @identifier := null) r
order by a.identifier

) a_numbered join (
select b.*,
       case 
          when @identifier = b.identifier then @rownum := @rownum + 1
          else @rownum := 1
       end as rn,
       @identifier := b.identifier
  from b
  join (select @rownum := 0, @identifier := null) r
order by b.identifier

) b_numbered 
on a_numbered.rn=b_numbered.rn and a_numbered.identifier=b_numbered.identifier
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文