查找比较表列的顶部匹配项

发布于 2025-02-01 12:08:53 字数 803 浏览 2 评论 0 原文

我有一个数据库,最多可从不同来源混合400个桌子。我需要按列相似性将这些表分组到Excel文件中(考虑到表具有0、1、2或所有具有相同名称的列)。挑战是示例如下:

fac.table_1 have columns C1, C2, C3, C4 and C5
dim.table_2 has columns C1, C3, and C5
stg.table_3 has columns C1, C6, and C7
stg.table_4 has columns C2, and C99

...

预期的结果应该是:

sch_name | table_name | ncols | nmatches
  dim    |   table_2  |   3   |   3
  stg    |   table_3  |   3   |   1
  stg    |   table_4  |   2   |   1

我认为方法是将类似代码与计数或相交的内容一起使用,将我想与他人进行比较的表名中的位置:

    SELECT
       schemas.name sch_nm,
       tables.name tb_nm,
       columns.name col_nm
    FROM sys.tables
       LEFT JOIN sys.columns ON tables.object_id = columns.object_id
       LEFT JOIN sys.schemas ON tables.schema_id = schemas.schema_id

I have a database with up to 400 tables mixed from different sources. I need to group those tables in an excel file by column similarity (considering that tables have 0, 1, 2, or all columns with the same name). The challenge is as the example follows:

fac.table_1 have columns C1, C2, C3, C4 and C5
dim.table_2 has columns C1, C3, and C5
stg.table_3 has columns C1, C6, and C7
stg.table_4 has columns C2, and C99

...

The expected result should be:

sch_name | table_name | ncols | nmatches
  dim    |   table_2  |   3   |   3
  stg    |   table_3  |   3   |   1
  stg    |   table_4  |   2   |   1

I think the way is to use something like this code together with COUNT or INTERSECT, inserting in WHERE the table name that I want to compare with others:

    SELECT
       schemas.name sch_nm,
       tables.name tb_nm,
       columns.name col_nm
    FROM sys.tables
       LEFT JOIN sys.columns ON tables.object_id = columns.object_id
       LEFT JOIN sys.schemas ON tables.schema_id = schemas.schema_id

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

叶落知秋 2025-02-08 12:08:53

您想计算另一个表中列名称的列数吗?

select sch_name, tbl_name, 
       ncols      = count(*), 
       nmatches   = sum(case when col_cnt > 1 then 1 else 0 end),
       percentage = sum(case when col_cnt > 1 then 1 else 0 end) * 100 / count(*) 
from
(
    select sch_name = s.name, 
           tbl_name = t.name,
           col_name = c.name,
           col_cnt  = count(c.name) over(partition by c.name)
    from   sys.schemas s
           inner join sys.tables t  on s.schema_id = t.schema_id
           inner join sys.columns c on t.object_id = c.object_id
    where  t.name in ('table1', 'table2', 'table3', 'table4')
) c       
where tbl_name not in ('table1')
group by sch_name, tbl_name
order by c.tbl_name;

结果:

sch_name tbl_name ncols nmatch
face table_1 5 4
dim table_2 3 3
stg table_3 3 1
stg table_4 2 1

db<>小提琴演示

You wanted to count the number of columns which the column name exists in another table ?

select sch_name, tbl_name, 
       ncols      = count(*), 
       nmatches   = sum(case when col_cnt > 1 then 1 else 0 end),
       percentage = sum(case when col_cnt > 1 then 1 else 0 end) * 100 / count(*) 
from
(
    select sch_name = s.name, 
           tbl_name = t.name,
           col_name = c.name,
           col_cnt  = count(c.name) over(partition by c.name)
    from   sys.schemas s
           inner join sys.tables t  on s.schema_id = t.schema_id
           inner join sys.columns c on t.object_id = c.object_id
    where  t.name in ('table1', 'table2', 'table3', 'table4')
) c       
where tbl_name not in ('table1')
group by sch_name, tbl_name
order by c.tbl_name;

Result :

sch_name tbl_name ncols nmatches
fac table_1 5 4
dim table_2 3 3
stg table_3 3 1
stg table_4 2 1

db<>fiddle demo

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文