Renaming duplicate rows
Here's a simplified example of my problem. I have a table where there's a "Name" column with duplicate entries:
ID Name
--- ----
1 AAA
2 AAA
3 AAA
4 BBB
5 CCC
6 CCC
7 DDD
8 DDD
9 DDD
10 DDD
Doing a GROUP BY like SELECT Name, COUNT(*) AS [Count] FROM Table GROUP BY Name results in this:
Name Count
---- -----
AAA 3
BBB 1
CCC 2
DDD 4
I'm only concerned about the duplicates, so I'll add a HAVING clause, SELECT Name, COUNT(*) AS [Count] FROM Table GROUP BY Name HAVING COUNT(*) > 1:
Name Count
---- -----
AAA 3
CCC 2
DDD 4
Trivial so far, but now things get tricky: I need a query to get me all the duplicate records, but with a nice incrementing indicator added to the Name column. The result should look something like this:
ID Name
--- --------
1 AAA
2 AAA (2)
3 AAA (3)
5 CCC
6 CCC (2)
7 DDD
8 DDD (2)
9 DDD (3)
10 DDD (4)
Note row 4 with "BBB" is excluded, and the first duplicate keeps the original Name.
Using an EXISTS statement gives me all the records I need, but how do I go about creating the new Name value?
SELECT * FROM Table AS T1
WHERE EXISTS (
    SELECT Name, COUNT(*) AS [Count]
    FROM Table
    GROUP BY Name
    HAVING (COUNT(*) > 1) AND (Name = T1.Name))
ORDER BY Name
I need to create an UPDATE statement that will fix all the duplicates, i.e. change the Name as per this pattern.
Update: Figured it out now. It was the PARTITION BY clause I was missing.
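For readers landing here, this is a minimal sketch of how PARTITION BY can produce the result shown in the question. It is not necessarily the query the asker ended up with. It assumes SQL Server 2005 or later and a character-typed Name column; the table name is bracketed because TABLE is a reserved word, and the aliases Numbered, rn and cnt are made up for illustration.

```sql
-- Number the rows inside each Name group and count the group size,
-- then keep only the groups that contain more than one row.
SELECT ID,
       CASE WHEN rn = 1 THEN Name  -- first occurrence keeps the original name
            ELSE Name + ' (' + CAST(rn AS varchar(10)) + ')'
       END AS Name
FROM (
    SELECT ID,
           Name,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS rn,
           COUNT(*)     OVER (PARTITION BY Name)             AS cnt
    FROM [Table]
) AS Numbered
WHERE cnt > 1
ORDER BY ID;
```

Ordering the window by ID is what makes the lowest ID in each group the one that keeps the plain name.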
If you want an update statement you can use pretty much the same structure:
Just update the subquery directly:
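The code that originally followed this answer is not preserved here. As a hedged sketch of the idea only: SQL Server allows an UPDATE to target a common table expression (or derived table) that maps back to a single base table, so the numbered rows can be renamed in place. The CTE name Numbered and the alias rn are assumptions for illustration, as is a Name column wide enough to hold the suffix.

```sql
-- Number the rows per Name, then update through the CTE.
-- The statement before WITH must be terminated with a semicolon.
WITH Numbered AS (
    SELECT Name,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS rn
    FROM [Table]
)
UPDATE Numbered
SET Name = Name + ' (' + CAST(rn AS varchar(10)) + ')'
WHERE rn > 1;  -- rn = 1 is the first occurrence, which keeps its name
```

Rows with a unique Name (such as BBB) always get rn = 1, so they are left untouched without any extra duplicate check.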
This will get you what you originally asked for. For the update statement, you'll want to use a WHILE loop and update the top 1 row on each pass.
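The loop this answer describes is not preserved here, so the following is only one possible reading of it. It picks the row with MAX(ID) instead of a literal UPDATE TOP (1) so that the outcome is deterministic. The variables @Name and @Cnt and the varchar(100) type are assumptions, and the sketch also assumes no existing Name already looks like 'AAA (2)'.

```sql
DECLARE @Name varchar(100), @Cnt int;

-- Keep going while any Name still occurs more than once.
WHILE EXISTS (SELECT 1 FROM [Table] GROUP BY Name HAVING COUNT(*) > 1)
BEGIN
    -- Pick one name that is still duplicated, with its current row count.
    SELECT TOP (1) @Name = Name, @Cnt = COUNT(*)
    FROM [Table]
    GROUP BY Name
    HAVING COUNT(*) > 1;

    -- Rename the highest-ID row of that group. The suffix equals the
    -- group's current size, so the lowest ID ends up keeping the plain name.
    UPDATE [Table]
    SET Name = @Name + ' (' + CAST(@Cnt AS varchar(10)) + ')'
    WHERE ID = (SELECT MAX(ID) FROM [Table] WHERE Name = @Name);
END
```

Each pass renames exactly one row, so this does far more work than a single set-based statement using ROW_NUMBER() and PARTITION BY.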
There's no need to do an UPDATE at all. The following will create the table for INSERT as desired:
Here's an even simpler UPDATE statement: