How can I "subtract" one table from another?

Posted 2024-07-13 08:34:05

I have a master table A with ~9 million rows. Another table B (same structure) holds ~28K rows that also exist in table A. What would be the best way to remove all contents of B from table A?

The combination of all columns (~10) is unique. There is nothing else in the form of a unique key.

Comments (6)

裂开嘴轻声笑有多痛 2024-07-20 08:34:05

If you have sufficient rights, you can create a new table and then rename it to A. To create the new table you can use the following script:

CREATE TABLE TEMP_A AS
SELECT *
FROM   A
MINUS
SELECT *
FROM   B;

This should perform pretty well.
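
The create-and-rename approach can be sketched end to end as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module with a toy three-column schema, and SQLite spells the set difference EXCEPT rather than MINUS. The table contents and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy stand-ins for A and B: same structure, no primary key (assumed schema).
cur.execute("CREATE TABLE A (col1 INTEGER, col2 TEXT, col3 TEXT)")
cur.execute("CREATE TABLE B (col1 INTEGER, col2 TEXT, col3 TEXT)")
rows_a = [(1, "x", "p"), (2, "y", "q"), (3, "z", "r"), (4, "w", "s")]
rows_b = [(2, "y", "q"), (4, "w", "s")]
cur.executemany("INSERT INTO A VALUES (?, ?, ?)", rows_a)
cur.executemany("INSERT INTO B VALUES (?, ?, ?)", rows_b)

# Build the replacement table from the set difference.
# EXCEPT is SQLite's spelling of the MINUS operator used in the answer.
cur.execute("CREATE TABLE TEMP_A AS SELECT * FROM A EXCEPT SELECT * FROM B")

# Swap the new table into place under the old name.
cur.execute("DROP TABLE A")
cur.execute("ALTER TABLE TEMP_A RENAME TO A")
conn.commit()

remaining = cur.execute("SELECT * FROM A ORDER BY col1").fetchall()
print(remaining)
```

Two things to keep in mind with this approach: a table created with CREATE TABLE ... AS SELECT does not carry over the indexes or constraints of the original, and the set difference also collapses exact duplicate rows, which is harmless here because the combination of all columns is stated to be unique.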

小女人ら 2024-07-20 08:34:05
DELETE FROM TableA WHERE ID IN (SELECT ID FROM TableB);

This should work, though it might take a while.

ら栖息 2024-07-20 08:34:05

One way: just list out all the columns.

DELETE FROM a
WHERE EXISTS (SELECT 1 FROM b
WHERE b.Col1 = a.Col1
AND b.Col2 = a.Col2
AND b.Col3 = a.Col3
AND b.Col4 = a.Col4);
-- continue the AND list until every column is covered

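
One caveat with comparing every column using `=` is that rows containing NULLs never match, because `NULL = NULL` is not true in SQL. The sketch below shows the effect and one workaround. It is an illustration only: it assumes SQLite (via Python's sqlite3 module), where the `IS` operator gives a NULL-safe comparison; other databases spell this differently. The four-column schema and the helper `build_delete` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE a (Col1 INTEGER, Col2 TEXT, Col3 TEXT, Col4 TEXT)")
cur.execute("CREATE TABLE b (Col1 INTEGER, Col2 TEXT, Col3 TEXT, Col4 TEXT)")
cur.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", [
    (1, "x", "p", "k"),
    (2, "y", None, "k"),   # contains a NULL
    (3, "z", "r", "k"),
])
cur.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", [
    (1, "x", "p", "k"),
    (2, "y", None, "k"),   # same row, NULL included
])

columns = ["Col1", "Col2", "Col3", "Col4"]

def build_delete(op):
    """Hypothetical helper: build the all-columns DELETE with the given comparison operator."""
    cond = " AND ".join(f"b.{c} {op} a.{c}" for c in columns)
    return f"DELETE FROM a WHERE EXISTS (SELECT 1 FROM b WHERE {cond})"

# With '=', the row holding a NULL is not matched and survives.
cur.execute(build_delete("="))
after_equals = cur.execute("SELECT Col1 FROM a ORDER BY Col1").fetchall()

# With SQLite's NULL-safe 'IS', the remaining matching row is removed too.
cur.execute(build_delete("IS"))
after_is = cur.execute("SELECT Col1 FROM a ORDER BY Col1").fetchall()

print(after_equals, after_is)
```

If none of the ~10 columns is nullable, the plain `=` version is sufficient.
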
盛装女皇 2024-07-20 08:34:05
-- t2 is the table rows are deleted from (table A in the question); t1 is table B
DELETE t2
FROM t1
INNER JOIN t2
  ON t1.col1 = t2.col1
  AND t1.col2 = t2.col2
  AND t1.col3 = t2.col3
  AND t1.col4 = t2.col4
  AND t1.col5 = t2.col5
  AND t1.col6 = t2.col6
  AND t1.col7 = t2.col7
  AND t1.col8 = t2.col8
  AND t1.col9 = t2.col9
  AND t1.col10 = t2.col10

This is likely to be very slow: to run quickly, every column would need to be indexed, which is highly unlikely in an environment where a table this size has no primary key. Run it during off-peak hours. What possessed you to have a table with 9 million records and no primary key?

往昔成烟 2024-07-20 08:34:05

If this is something you'll have to do on a regular basis, the first choice should be to try to improve the database design (looking for primary keys, trying to get the "join" condition onto as few columns as possible).

If that is not possible, the second option is to figure out the "selectivity" of each of the columns, i.e. how many different values each column has: 'name' would be more selective than 'address country', which in turn would be more selective than 'male/female'.
The general type of statement I'd suggest would be like this:

DELETE FROM tableA
WHERE EXISTS (SELECT * FROM tableB
 WHERE tableA.colx1 = tableB.colx1
 AND tableA.colx2 = tableB.colx2
 -- ... and so on for the columns in between ...
 AND tableA.colx10 = tableB.colx10);

The idea is to list the columns in order of selectivity and build an index on tableB over colx1, colx2, etc. The exact number of columns to include in that index would be the result of some trial and measurement. (Weigh the time spent building the index on tableB against the time saved by the delete statement.)

If this is just a one-time operation, I'd simply pick one of the slow methods outlined above. It's probably not worth thinking about this too much when you can just start the statement before going home ...
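
The selectivity idea can be sketched roughly as follows: count the distinct values per column, order the columns from most to least selective, index the leading ones on tableB, and then run the EXISTS delete. This is an assumption-laden illustration using Python's sqlite3 module; the three-column schema, the sample rows, the index name `idx_b_selective`, and the choice of two index columns are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tableA (name TEXT, country TEXT, gender TEXT)")
cur.execute("CREATE TABLE tableB (name TEXT, country TEXT, gender TEXT)")
cur.executemany("INSERT INTO tableA VALUES (?, ?, ?)", [
    ("Ann", "NL", "F"), ("Bob", "NL", "M"), ("Cid", "US", "M"),
    ("Dee", "US", "F"), ("Eve", "DE", "F"), ("Fay", "NL", "F"),
])
cur.executemany("INSERT INTO tableB VALUES (?, ?, ?)", [
    ("Bob", "NL", "M"), ("Eve", "DE", "F"),
])

columns = ["gender", "country", "name"]

# Selectivity proxy: number of distinct values per column in tableA.
distinct_counts = {
    c: cur.execute(f"SELECT COUNT(DISTINCT {c}) FROM tableA").fetchone()[0]
    for c in columns
}

# Most selective column first.
ordered = sorted(columns, key=lambda c: distinct_counts[c], reverse=True)

# Index only the leading columns on tableB; how many to include is a
# matter of trial and measurement (two is an arbitrary choice here).
index_columns = ordered[:2]
cur.execute(f"CREATE INDEX idx_b_selective ON tableB ({', '.join(index_columns)})")

# The delete still compares every column, listed in selectivity order.
conditions = " AND ".join(f"tableA.{c} = tableB.{c}" for c in ordered)
cur.execute(f"DELETE FROM tableA WHERE EXISTS (SELECT * FROM tableB WHERE {conditions})")
deleted = cur.rowcount
conn.commit()

remaining = [r[0] for r in cur.execute("SELECT name FROM tableA ORDER BY name")]
print(ordered, deleted, remaining)
```

On a real 9-million-row table, counting distinct values for every column is itself a full scan per column, so it may be worth sampling or relying on optimizer statistics instead.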

╭ゆ眷念 2024-07-20 08:34:05

Is there a key value (or set of values) that can be used?

Something like:

DELETE a
FROM tableA a
INNER JOIN tableB b
ON b.id = a.id