Delete all but one of a set of duplicate records

Posted on 2024-11-05 04:23:21

I have a table that is supposed to keep track of visitors to a given profile (user id to user id pair). It turns out my SQL query was a bit off and is producing multiple rows per pair instead of a single one as intended. In hindsight, I should have enforced a unique constraint on each id+id pair.

Now, how could I go about cleaning up the table? What I want to do is delete all duplicate pairs and leave just one.

So for example change this:

23515 -> 52525 date_visited
23515 -> 52525 date_visited
23515 -> 52525 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
23515 -> 52525 date_visited
...

Into this:

23515 -> 52525 date_visited
12345 -> 54321 date_visited

Update: Here is the table structure as requested:

Column        Type       Attributes  Null  Default            Extra
id            int(10)    UNSIGNED    No    None               AUTO_INCREMENT
profile_id    int(10)    UNSIGNED    No    0
visitor_id    int(10)    UNSIGNED    No    0
date_visited  timestamp              No    CURRENT_TIMESTAMP


Comments (6)

櫻之舞 2024-11-12 04:23:21

ANSI SQL Solution

Use group by in a subquery:

delete from my_tab where id not in 
(select min(id) from my_tab group by profile_id, visitor_id);

You need some kind of unique identifier (here, I'm using id).

MySQL Solution

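As pointed out by @JamesPoulson, MySQL rejects the statement above, because it does not allow a DELETE to select from its own target table in a subquery (error 1093). Wrapping the subquery in a derived table avoids this (as shown in James' answer):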

delete from `my_tab` where id not in
( SELECT * FROM 
    (select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);
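
To try the idea on throwaway data before touching the real table, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not the MySQL setup from the question: the in-memory database, the helper names make_visits and delete_duplicates, and the index name uq_pair are all made up for this example, and SQLite accepts the plain ANSI form without the derived-table wrapper.

```python
import sqlite3


def make_visits(conn):
    """Create a scratch table shaped like the one in the question (assumed layout)."""
    conn.execute(
        """
        CREATE TABLE my_tab (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            profile_id INTEGER NOT NULL DEFAULT 0,
            visitor_id INTEGER NOT NULL DEFAULT 0,
            date_visited TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    # Same pattern as the sample data in the question: 3 + 4 + 1 rows.
    pairs = [(23515, 52525)] * 3 + [(12345, 54321)] * 4 + [(23515, 52525)]
    conn.executemany(
        "INSERT INTO my_tab (profile_id, visitor_id) VALUES (?, ?)", pairs
    )


def delete_duplicates(conn):
    """Keep the row with the lowest id for each (profile_id, visitor_id) pair."""
    cur = conn.execute(
        """
        DELETE FROM my_tab
        WHERE id NOT IN (
            SELECT MIN(id) FROM my_tab GROUP BY profile_id, visitor_id
        )
        """
    )
    return cur.rowcount  # number of rows removed


conn = sqlite3.connect(":memory:")
make_visits(conn)
deleted = delete_duplicates(conn)
remaining = conn.execute(
    "SELECT id, profile_id, visitor_id FROM my_tab ORDER BY id"
).fetchall()

# With the duplicates gone, a unique index on the pair can be created;
# it would fail while duplicate pairs were still present.
conn.execute("CREATE UNIQUE INDEX uq_pair ON my_tab (profile_id, visitor_id)")

print(deleted, remaining)
```

Creating the unique index as the last step follows the asker's own remark about enforcing uniqueness: once it exists, a second insert of the same pair raises an error instead of silently adding a duplicate.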
小巷里的女流氓 2024-11-12 04:23:21

Here's Frank Schmitt's solution with a small workaround that wraps the subquery in a derived table (temp_tab) so that it runs on MySQL:

delete from `my_tab` where id not in
( SELECT * FROM 
    (select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
)
翻了热茶 2024-11-12 04:23:21

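This works on SQL Server, which allows a DELETE to target a CTE. Partition by the columns that define a duplicate, not by the unique ID:

With NewCTE
AS
(
-- number the rows within each (profile_id, visitor_id) pair, lowest id first
Select *, Row_number() over(partition by profile_id, visitor_id order by id) as RowNumber
from table_name
)
-- keep row 1 of each pair and delete the rest
Delete from NewCTE where RowNumber > 1;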
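
For ROW_NUMBER() to remove duplicate pairs, the window has to be partitioned by profile_id and visitor_id; partitioning by a unique id gives every row the number 1, so nothing is deleted. The sketch below shows that partitioning on scratch data with Python's sqlite3 module. It assumes an SQLite build with window functions (3.25 or later), and because SQLite cannot delete through a CTE, it selects the ids to remove in a subquery instead. The sample rows and the function name delete_by_row_number are made up for illustration.

```python
import sqlite3


def delete_by_row_number(conn):
    """Delete every row numbered 2 or higher within its (profile_id, visitor_id) pair."""
    cur = conn.execute(
        """
        DELETE FROM table_name
        WHERE id IN (
            SELECT id FROM (
                SELECT id,
                       ROW_NUMBER() OVER (
                           PARTITION BY profile_id, visitor_id
                           ORDER BY id
                       ) AS row_num
                FROM table_name
            ) AS numbered
            WHERE row_num > 1
        )
        """
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name ("
    " id INTEGER PRIMARY KEY, profile_id INTEGER, visitor_id INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name (profile_id, visitor_id) VALUES (?, ?)",
    [(1, 2), (1, 2), (3, 4), (1, 2), (3, 4), (5, 6)],
)

removed = delete_by_row_number(conn)
kept_ids = [row[0] for row in conn.execute("SELECT id FROM table_name ORDER BY id")]
print(removed, kept_ids)
```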
予囚 2024-11-12 04:23:21

  1. Select all unique rows.
  2. Copy them to a new temp table.
  3. Truncate the original table.
  4. Copy the temp table data back to the original table.

That's what I'd do. I'm not sure if there's a single query that would do all this for you.
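
The four steps could look roughly like this. It is a sketch on scratch data using Python's sqlite3 module, with several assumptions: the table name visits, the temp table name visits_keep, and the function name are hypothetical; "unique rows" is read as "the lowest-id row of each pair"; and SQLite has no TRUNCATE, so an unfiltered DELETE stands in for it.

```python
import sqlite3


def rebuild_without_duplicates(conn):
    """Copy one row per pair to a temp table, empty the original, copy back."""
    # The with-block commits on success and rolls back the open
    # transaction if one of the statements raises.
    with conn:
        # Steps 1 and 2: pick the lowest-id row of each pair and park it.
        conn.execute(
            """
            CREATE TEMP TABLE visits_keep AS
            SELECT * FROM visits
            WHERE id IN (
                SELECT MIN(id) FROM visits GROUP BY profile_id, visitor_id
            )
            """
        )
        # Step 3: empty the original table (stand-in for TRUNCATE).
        conn.execute("DELETE FROM visits")
        # Step 4: copy the kept rows back, preserving their ids.
        conn.execute(
            """
            INSERT INTO visits (id, profile_id, visitor_id, date_visited)
            SELECT id, profile_id, visitor_id, date_visited FROM visits_keep
            """
        )
        conn.execute("DROP TABLE visits_keep")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits ("
    " id INTEGER PRIMARY KEY, profile_id INTEGER,"
    " visitor_id INTEGER, date_visited TEXT)"
)
conn.executemany(
    "INSERT INTO visits (profile_id, visitor_id, date_visited) VALUES (?, ?, ?)",
    [
        (10, 20, "2024-01-01"),
        (10, 20, "2024-01-02"),
        (30, 40, "2024-01-03"),
        (10, 20, "2024-01-04"),
    ],
)

rebuild_without_duplicates(conn)
rows = conn.execute("SELECT * FROM visits ORDER BY id").fetchall()
print(rows)
```

On a live table, keep in mind that the original is empty for a moment between steps 3 and 4, so this approach suits a maintenance window better than a busy system.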

空城缀染半城烟沙 2024-11-12 04:23:21

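None of the answers above worked for me. The requirement is to delete all duplicates except one from each set, where a duplicate is defined by more than one column.

Try this Oracle PL/SQL block. Each run deletes one row from every duplicated group (up to 1000 groups per run), so rerun it until no duplicates remain:

 SET SERVEROUTPUT ON
 declare
   -- starts at 1, so the printed value is the number of groups handled plus one
   name integer := 1;
 begin
   -- one iteration per group of rows that share all three column values
   for test in (
     select tablecolumn1, tablecolumn2, tablecolumn3, count(1) as copies
     from yourtable
     group by tablecolumn1, tablecolumn2, tablecolumn3
     having count(1) > 1
   )
   loop
     if name <= 1000 then -- cap the work per run, for incremental cleanup
       name := name + 1;

       -- remove a single row of this group from the table itself
       delete from yourtable b
       where b.tablecolumn1 = test.tablecolumn1
         and b.tablecolumn2 = test.tablecolumn2
         and b.tablecolumn3 = test.tablecolumn3
         and rownum = 1;
     end if;
   end loop;

   DBMS_OUTPUT.PUT_LINE(name);
 end;
 /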
紙鸢 2024-11-12 04:23:21

If you are working in a SQL client with a table editor, you can delete the duplicate rows by hand, keeping one entry. Just follow this procedure:

  1. Open the table that contains the duplicate data.
  2. Apply a filter to isolate the duplicate rows for a single id.
  3. Select all the rows you want to delete.
  4. Press delete and save the result.
  5. Repeat the process for each id that has duplicate entries.

It's a long procedure, but you can see the results immediately.

Hope this solution works for you!
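
Before going through the table by hand, it can help to list which pairs are actually duplicated and how many copies each has. Below is a small sketch of such a query, run on scratch data with Python's sqlite3 module; the table name visits, the function name find_duplicate_pairs, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def find_duplicate_pairs(conn):
    """Return (profile_id, visitor_id, copies) for every pair stored more than once."""
    return conn.execute(
        """
        SELECT profile_id, visitor_id, COUNT(*) AS copies
        FROM visits
        GROUP BY profile_id, visitor_id
        HAVING COUNT(*) > 1
        ORDER BY profile_id, visitor_id
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits ("
    " id INTEGER PRIMARY KEY, profile_id INTEGER, visitor_id INTEGER)"
)
conn.executemany(
    "INSERT INTO visits (profile_id, visitor_id) VALUES (?, ?)",
    [(23515, 52525)] * 4 + [(12345, 54321)] * 4 + [(1, 2)],
)

duplicates = find_duplicate_pairs(conn)
print(duplicates)
```

Each row of the result names one pair to filter on in step 2, and a copies value of n means n - 1 rows of that pair should be deleted.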
