T-SQL: Delete all duplicate rows but keep one
I have a table with a very large number of rows. Duplicates are not allowed, but due to a problem with how the rows were created, I know there are some duplicates in this table.
I need to eliminate the extra rows from the perspective of the key columns. Some other columns may have slightly different data, but I do not care about that. I still need to keep one of these rows, however. SELECT DISTINCT won't work because it operates on all columns, and I need to suppress duplicates based on the key columns.
How can I delete the extra rows but still keep one efficiently?
Comments (3)
You didn't say what version you were using, but in SQL Server 2005 and above, you can use a common table expression with the OVER clause. It goes a little something like this:
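The answer's code did not survive in this copy. A minimal sketch of the technique it describes, assuming a hypothetical table dbo.MyTable in which foo and bar are the key columns and baz is some other column (the column names follow the answer's own edit note; the table name is invented for illustration):

```sql
-- Sketch only: dbo.MyTable is a hypothetical table name;
-- foo and bar stand for the key columns, baz for any other column.
-- Note: a statement that precedes WITH must end with a semicolon.
WITH cte AS (
    SELECT
        foo,
        bar,
        baz,
        -- Number the rows inside each (foo, bar) group, starting at 1.
        ROW_NUMBER() OVER (PARTITION BY foo, bar ORDER BY baz) AS rn
    FROM dbo.MyTable
)
-- Deleting through the CTE removes the underlying table rows.
-- Row 1 of each group survives; every other row in the group is deleted.
DELETE FROM cte
WHERE rn > 1;
```

Swapping the DELETE for SELECT * FROM cte WHERE rn > 1 first is a cheap way to preview which rows would be removed.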
Play around with it and see what you get.
(Edit: In an attempt to be helpful, someone edited the ORDER BY clause within the CTE. To be clear, you can order by anything you want here; it needn't be one of the columns returned by the CTE. In fact, a common use case here is that "foo, bar" are the group identifier and "baz" is some sort of timestamp. In order to keep the latest, you'd do ORDER BY baz desc.)

Example query:
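The query itself is missing from this copy. A sketch of one query that fits the description, assuming a hypothetical table dbo.MyTable with a unique Id column, where field1 and field2 stand in for the fields (all of these names are invented for illustration and are not taken from the answer):

```sql
-- Sketch only: dbo.MyTable, Id, field1, and field2 are hypothetical names.
-- Assumes Id is unique per row and never NULL.
DELETE FROM dbo.MyTable
WHERE Id NOT IN (
    -- One surviving Id (the smallest) per group of duplicates.
    SELECT MIN(Id)
    FROM dbo.MyTable
    GROUP BY field1, field2
);
```

Using MAX(Id) instead would keep the row with the largest Id in each group.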
Here, fields are the columns on which you want to group the duplicate rows.

Here's my twist on it, with a runnable example. Note this will only work in the situation where Id is unique and you have duplicate values in other columns.
And the results:
Not sure why that's what I thought of first... definitely not the simplest way to go but it works.
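The runnable example and its output are not preserved above. As a stand-in, here is a small self-contained sketch of an Id-based approach; it is not necessarily the query this answer used. The table variable @t and its columns are hypothetical, Id is assumed unique, and the multi-row VALUES syntax needs SQL Server 2008 or later:

```sql
-- Sketch only: @t, Id, Name, and Note are hypothetical; Id is assumed unique.
DECLARE @t TABLE (Id int PRIMARY KEY, Name varchar(20), Note varchar(20));

INSERT INTO @t (Id, Name, Note)
VALUES (1, 'alpha', 'first'),
       (2, 'alpha', 'second'),
       (3, 'beta',  'only'),
       (4, 'alpha', 'third');

-- Delete a row when another row with the same Name has a smaller Id,
-- so the lowest Id in each group of duplicates is the one that stays.
DELETE d
FROM @t AS d
WHERE EXISTS (
    SELECT 1
    FROM @t AS k
    WHERE k.Name = d.Name
      AND k.Id < d.Id
);

-- Remaining rows: Id 1 ('alpha') and Id 3 ('beta').
SELECT Id, Name, Note FROM @t ORDER BY Id;
```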