SQL Server 重复删除的巧妙方法
SQL Server 2008,我继承了一个像这样的表(74k行):
create table #mess (keycol char(36) , name1 varchar(254) , name2 varchar(254), valuex varchar(254) )
insert into #mess values ('971EC307-8514-450D-AE3A-4E25EA3F3A10' , 'a' , '' , '' )
insert into #mess values ('971EC307-8514-450D-AE3A-4E25EA3F3A10' , '' , 'a' , 'value-a' )
insert into #mess values ('04FD0C0B-FC90-405A-BFD6-C3AF2516E51F' , 'b' , '' , 'value-b' )
insert into #mess values ('578F2893-15E6-4877-9FE6-AC2F4F351143' , 'c' , '' , 'value-c' )
insert into #mess values ('04FD0C0B-FC90-405A-BFD6-C3AF2516E51F' , '' , 'b' , '' )
insert into #mess values ('FAFCBDFE-D49E-4566-882D-0B6628DA59CC' , '' , 'd' , 'value-d' )
这使得这个结果集:
keycol name1 name2 valuex
------------------------------------ ------- -------- ------------
971EC307-8514-450D-AE3A-4E25EA3F3A10 a
971EC307-8514-450D-AE3A-4E25EA3F3A10 a value-a
04FD0C0B-FC90-405A-BFD6-C3AF2516E51F b value-b
578F2893-15E6-4877-9FE6-AC2F4F351143 c value-c
04FD0C0B-FC90-405A-BFD6-C3AF2516E51F b
FAFCBDFE-D49E-4566-882D-0B6628DA59CC d value-d
我需要使它像这样(根据keycol
匹配的时间来消除重复和折叠数据两行之间,如果 name2
为空,则使用 name1
,反之亦然,但始终使用 name1
且始终使用非空 valuex< /代码> 柱子)。所有想法都受到赞赏。
谢谢。
keycol name1 valuex
------------------------------------ ------- ------------
971EC307-8514-450D-AE3A-4E25EA3F3A10 a value-a
04FD0C0B-FC90-405A-BFD6-C3AF2516E51F b value-b
578F2893-15E6-4877-9FE6-AC2F4F351143 c value-c
FAFCBDFE-D49E-4566-882D-0B6628DA59CC d value-d
SQL Server 2008, I have inherited a table like this (74k rows):
create table #mess (keycol char(36) , name1 varchar(254) , name2 varchar(254), valuex varchar(254) )
insert into #mess values ('971EC307-8514-450D-AE3A-4E25EA3F3A10' , 'a' , '' , '' )
insert into #mess values ('971EC307-8514-450D-AE3A-4E25EA3F3A10' , '' , 'a' , 'value-a' )
insert into #mess values ('04FD0C0B-FC90-405A-BFD6-C3AF2516E51F' , 'b' , '' , 'value-b' )
insert into #mess values ('578F2893-15E6-4877-9FE6-AC2F4F351143' , 'c' , '' , 'value-c' )
insert into #mess values ('04FD0C0B-FC90-405A-BFD6-C3AF2516E51F' , '' , 'b' , '' )
insert into #mess values ('FAFCBDFE-D49E-4566-882D-0B6628DA59CC' , '' , 'd' , 'value-d' )
Which makes this result set:
keycol name1 name2 valuex
------------------------------------ ------- -------- ------------
971EC307-8514-450D-AE3A-4E25EA3F3A10 a
971EC307-8514-450D-AE3A-4E25EA3F3A10 a value-a
04FD0C0B-FC90-405A-BFD6-C3AF2516E51F b value-b
578F2893-15E6-4877-9FE6-AC2F4F351143 c value-c
04FD0C0B-FC90-405A-BFD6-C3AF2516E51F b
FAFCBDFE-D49E-4566-882D-0B6628DA59CC d value-d
I need to make it like this (de-duplicating and collapsing the data, based on when keycol
matches between two rows, and using name1
if name2
is empty and vice versa but always using name1
and always using non-empty valuex
column). All ideas appreciated.
Thanks.
keycol name1 valuex
------------------------------------ ------- ------------
971EC307-8514-450D-AE3A-4E25EA3F3A10 a value-a
04FD0C0B-FC90-405A-BFD6-C3AF2516E51F b value-b
578F2893-15E6-4877-9FE6-AC2F4F351143 c value-c
FAFCBDFE-D49E-4566-882D-0B6628DA59CC d value-d
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
顺便说一句——右边的 3 列本质上是相同的。您可以删除最右边的值(例如“value-a”等值)而不会受到任何惩罚,然后将其他两个组合起来。
无论如何 - 几种不同的方式 - 使用 DISTINCT,或者 GROUP BY 来获得你想要的:
使用 DISTINCT:
使用 GROUP BY:
BTW -- The 3 columns on the right are, essentially, identical. You can drop the right-most (values like 'value-a', etc.) with no penalty, then, combine the other two.
Anyway -- a couple of different ways -- using DISTINCT, or alternatively, GROUP BY to get what you want:
Using DISTINCT:
Using GROUP BY:
像这样的东西吗?
Something like this?
您可以使用 COALESCE() 函数。它接受任意数量的参数,并返回第一个非 NULL 值作为其结果。
编辑:
SELECT keycol, COALESCE(name1, name2), valuex GROUP BY keycol好的。显然是错误的..感谢您指出。我现在就去完成醒来......星期一......
You can use the COALESCE() function. It takes any number of arguments, and returns the first non-NULL value as its result.
edit:
SELECT keycol, COALESCE(name1, name2), valuex GROUP BY keycolOk. Obviously wrong.. thanks for pointing it out. I'll just go finish waking up now.... mondays...
看起来数据结构似乎总是只需要每个 keycol 的非 NULL valuex 行。您应该能够实现这一目标,并通过使用 UNION 消除 NULL valuex 行来将 name1 和 name2 折叠在一起,如下所示:
如果每个 keycol 始终只有一个有效的 valuex 行,则这应该可行。如果您的数据布局并非如此,请告诉我,我将进一步修改。
It looks as if the data structure is such that you always just want the non-NULL valuex row for each keycol. You should be able to achieve this, and collapse name1 and name2 together, by eliminating the NULL valuex rows with a UNION like so:
If there is always just one valid valuex row for each keycol, this should work. If it isn't the case that this is how your data is laid out, let me know and I'll modify further.