用很少的条件删除所有重复的主题

发布于 2024-10-01 12:12:21 字数 402 浏览 0 评论 0原文

我正在尝试创建将删除所有重复标题的 sql,但必须删除具有以下条件的重复项:

  • 必须仅删除具有相同 object_id 的重复项
  • 必须仅保留最新记录(最大的 topic_id >) (topic_id是每个主题AI的唯一id)

到目前为止我已经做到了(用select测试...)

SELECT topic_id,object_id,title,url,date 
FROM topics GROUP BY title 
HAVING ( COUNT(title) > 1) 
ORDER BY topic_id DESC

但不满足条件。
我正在使用mysql。

I'm trying to make sql who will delete all duplicate titles BUT must delete duplicates with these conditions:

  • must delete only duplicates with same object_id
  • must keep only the newest record (biggest topic_id) (topic_id is the unique id for every topic AI)

So far I've done that (testing with select...)

SELECT topic_id,object_id,title,url,date 
FROM topics GROUP BY title 
HAVING ( COUNT(title) > 1) 
ORDER BY topic_id DESC

But doesn't meet the conditions.
I'm using mysql.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

蒗幽 2024-10-08 12:12:21

MySQL 中,您无法将目标表指定给子查询中的 DML 操作(除非您将其嵌套超过一层,但在这种情况下,您不会得到可靠的结果并且不能使用相关子查询)。

使用JOIN

DELETE  td
FROM    topics td
JOIN    topics ti
ON      ti.object_id = td.object_id
        AND ti.title = td.title
        AND ti.topic_id > td.topic_id;

主题(object_id、title、topic_id)上创建索引,以便快速运行。

In MySQL, you cannot specify the target table to a DML operation in a subquery (unless you nest it more than one level deep, but in this case you won't get reliable results and cannot use correlated subqueries).

Use a JOIN:

DELETE  td
FROM    topics td
JOIN    topics ti
ON      ti.object_id = td.object_id
        AND ti.title = td.title
        AND ti.topic_id > td.topic_id;

Create an index on topics (object_id, title, topic_id) for this to work fast.

回忆凄美了谁 2024-10-08 12:12:21

这将删除所有重复的 object_id 记录,保留具有最高 topic_id 的记录。

delete from topics outer where exists
(
    select 1 from topics inner where 
        outer.object_id = inner.object_id AND 
        inner.topic_id  < outer.topic_id
)

This will delete all duplicate object_id records save the one with the highest topic_id.

delete from topics outer where exists
(
    select 1 from topics inner where 
        outer.object_id = inner.object_id AND 
        inner.topic_id  < outer.topic_id
)
叹沉浮 2024-10-08 12:12:21

首先,如果您有日期字段,则最好通过日期来识别最新记录。

这将起作用:

SELECT topic_id, object_id, title, url, date 
FROM   topics earlier
WHERE  EXISTS 
    (SELECT newest.topic_id 
     FROM   topics newest 
     WHERE  newest.date      > earlier.date 
     AND    newest.object_id = earlier.object_id)

您正在选择存在另一行具有相同 object_id 和更新日期的行。

First, if you have a date field, you'd do better identifying the newest records by their date.

This will work :

SELECT topic_id, object_id, title, url, date 
FROM   topics earlier
WHERE  EXISTS 
    (SELECT newest.topic_id 
     FROM   topics newest 
     WHERE  newest.date      > earlier.date 
     AND    newest.object_id = earlier.object_id)

You're selecting the lines for which exists another line with the same object_id and a more recent date.

享受孤独 2024-10-08 12:12:21

WITH tbl AS (SELECT topic_id, object_id, row_number() over(partition by object_id order by topic_id DESC) as rnum
来自主题)
删除 tbl,其中 rnum > 1

欲了解更多信息,请查看这篇文章:
http://blog.sqlauthority。 com/2009/06/23/sql-server-2005-2008-delete-duplicate-rows/

WITH tbl AS (SELECT topic_id, object_id, row_number() over(partition by object_id order by topic_id DESC) as rnum
FROM topics)
DELETE tbl WHERE rnum > 1

For more information please check this article:
http://blog.sqlauthority.com/2009/06/23/sql-server-2005-2008-delete-duplicate-rows/

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文