查找比较表列的顶部匹配项

发布于 2025-02-01 12:08:53 字数 803 浏览 2 评论 0 原文

我有一个数据库，最多可从不同来源混合400个桌子。我需要按列相似性将这些表分组到Excel文件中（考虑到表具有0、1、2或所有具有相同名称的列）。挑战是示例如下：

fac.table_1 have columns C1, C2, C3, C4 and C5
dim.table_2 has columns C1, C3, and C5
stg.table_3 has columns C1, C6, and C7
stg.table_4 has columns C2, and C99

...

预期的结果应该是：

sch_name | table_name | ncols | nmatches
  dim    |   table_2  |   3   |   3
  stg    |   table_3  |   3   |   1
  stg    |   table_4  |   2   |   1

我认为方法是将类似代码与计数或相交的内容一起使用，将我想与他人进行比较的表名中的位置：

    SELECT
       schemas.name sch_nm,
       tables.name tb_nm,
       columns.name col_nm
    FROM sys.tables
       LEFT JOIN sys.columns ON tables.object_id = columns.object_id
       LEFT JOIN sys.schemas ON tables.schema_id = schemas.schema_id

原文

I have a database with up to 400 tables mixed from different sources. I need to group those tables in an excel file by column similarity (considering that tables have 0, 1, 2, or all columns with the same name). The challenge is as the example follows:

fac.table_1 have columns C1, C2, C3, C4 and C5
dim.table_2 has columns C1, C3, and C5
stg.table_3 has columns C1, C6, and C7
stg.table_4 has columns C2, and C99

...

The expected result should be:

sch_name | table_name | ncols | nmatches
  dim    |   table_2  |   3   |   3
  stg    |   table_3  |   3   |   1
  stg    |   table_4  |   2   |   1

I think the way is to use something like this code together with COUNT or INTERSECT, inserting in WHERE the table name that I want to compare with others:

    SELECT
       schemas.name sch_nm,
       tables.name tb_nm,
       columns.name col_nm
    FROM sys.tables
       LEFT JOIN sys.columns ON tables.object_id = columns.object_id
       LEFT JOIN sys.schemas ON tables.schema_id = schemas.schema_id

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

叶落知秋 2025-02-08 12:08:53

您想计算另一个表中列名称的列数吗？

select sch_name, tbl_name, 
       ncols      = count(*), 
       nmatches   = sum(case when col_cnt > 1 then 1 else 0 end),
       percentage = sum(case when col_cnt > 1 then 1 else 0 end) * 100 / count(*) 
from
(
    select sch_name = s.name, 
           tbl_name = t.name,
           col_name = c.name,
           col_cnt  = count(c.name) over(partition by c.name)
    from   sys.schemas s
           inner join sys.tables t  on s.schema_id = t.schema_id
           inner join sys.columns c on t.object_id = c.object_id
    where  t.name in ('table1', 'table2', 'table3', 'table4')
) c       
where tbl_name not in ('table1')
group by sch_name, tbl_name
order by c.tbl_name;

结果：

sch_name	tbl_name	ncols	nmatch
face	table_1	5	4
dim	table_2	3	3
stg	table_3	3	1
stg	table_4	2	1

db＆lt;＆gt;小提琴演示

You wanted to count the number of columns which the column name exists in another table ?

select sch_name, tbl_name, 
       ncols      = count(*), 
       nmatches   = sum(case when col_cnt > 1 then 1 else 0 end),
       percentage = sum(case when col_cnt > 1 then 1 else 0 end) * 100 / count(*) 
from
(
    select sch_name = s.name, 
           tbl_name = t.name,
           col_name = c.name,
           col_cnt  = count(c.name) over(partition by c.name)
    from   sys.schemas s
           inner join sys.tables t  on s.schema_id = t.schema_id
           inner join sys.columns c on t.object_id = c.object_id
    where  t.name in ('table1', 'table2', 'table3', 'table4')
) c       
where tbl_name not in ('table1')
group by sch_name, tbl_name
order by c.tbl_name;

Result :