Optimize SQL update

Published 2024-07-13 10:24:44


We have 2 tables, called TableToUpdate and Dates. We need to update TableToUpdate's EndTimeKey column by looking up values in the other table, Dates. We run the SQL below to do this, but it takes too long to finish.

Table TableToUpdate has 6M records.
Table Dates has 5000 records.

How can we optimize it?

Thanks for replies!

update TableToUpdate set
EndTimeKey = DATE_NO
from Dates where EndTime = DATE


4 Comments

酸甜透明夹心 2024-07-20 10:24:46


You are updating potentially 6 million records; this is not going to be terribly speedy in any event. However, look at your execution plan and see if it is using indexes.

Also, run this in batches; that is generally faster when updating large numbers of records. Do the update during off hours, when there is little load on the database; this will reduce potential locking issues. Make sure your data types are the same between the two tables so you aren't having to do any implicit conversions.
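
A batched version could be sketched roughly like this (a minimal T-SQL sketch, not a drop-in solution; it assumes EndTimeKey is not nullable, since the `<>` filter doubles as the loop's stopping condition):

```sql
-- Update in chunks of 100k rows until no rows remain to change.
-- TOP in an UPDATE limits the rows touched per batch (SQL Server).
WHILE 1 = 1
BEGIN
    UPDATE TOP (100000) t
    SET EndTimeKey = d.DATE_NO
    FROM TableToUpdate t
    JOIN Dates d ON t.EndTime = d.DATE
    WHERE t.EndTimeKey <> d.DATE_NO;

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to update
END
```

Smaller transactions keep the log from growing and release locks between batches.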

Look at the table you are updating, are there any triggers on it? Depending on how the trigger is written, this could seriously slow down an update of that many records (especially if someone who wasn't too bright decided to put a cursor or a loop in the trigger instead of writing set-based code).

Also, here is something I would add (I also changed it to explicitly show the join):

update t
set EndTimeKey = d.DATE_NO
from TableToUpdate t
join Dates d on t.EndTime = d.DATE
where t.EndTimeKey <> d.DATE_NO

No point in updating records that already match.

陪你到最终 2024-07-20 10:24:46


With this volume of data you might be best creating a SELECT query which produces a resultset, complete with updated values, as you would like to see the new table. Next, SELECT these into a new table (perhaps 'NewTableToUpdate'), either by creating the table and using INSERT INTO, or by changing your SELECT to add an INTO clause that creates the new table.

Next use sp_rename to rename 'TableToUpdate' to 'OLDTableToUpdate' and 'NEWTableToUpdate' to 'TableToUpdate' and then create the indexes as you had them on the original table.

In my experience I've found this to be the quickest means of achieving big changes like this. HTH.

Extra thought... if you have a clustered index on your table then add an ORDER BY to your SELECT statement to ensure it is inserted into your new table in the same sequence as the clustered index. That'll speed up the index creation in a significant way.
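
The rebuild-and-swap approach above could be sketched as follows (an illustrative sketch; the Id column and index names are assumptions about the schema, and any other columns of TableToUpdate would need to be listed too):

```sql
-- Build the new table with the updated values, ordered to match the
-- clustered index of the original table (assumed here to be on Id).
-- Note the inner join drops rows with no matching date; use a LEFT
-- JOIN if every row must be preserved.
SELECT t.Id, t.EndTime, d.DATE_NO AS EndTimeKey
INTO NewTableToUpdate
FROM TableToUpdate t
JOIN Dates d ON t.EndTime = d.DATE
ORDER BY t.Id;

-- Swap the tables, then recreate the indexes on the new copy.
EXEC sp_rename 'TableToUpdate', 'OLDTableToUpdate';
EXEC sp_rename 'NewTableToUpdate', 'TableToUpdate';
CREATE CLUSTERED INDEX IX_TableToUpdate_Id ON TableToUpdate (Id);
```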

走野 2024-07-20 10:24:46


You can set some indexes on the relevant fields (in the relevant order), i.e. EndTimeKey and EndTime, though do not expect too much from this. The other thing you could check is whether you have other constraints that limit the query results.

You could also create a view that returns the correct DATE_NO for every TableToUpdate.EndTimeKey.

Maybe you could write a stored procedure, if your DBMS supports such things - this could really speed up the update.
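
The lookup view mentioned above could look like this (a sketch only; the view name is an assumption, and whether a view actually speeds things up depends on the engine):

```sql
-- A view mapping each row's EndTime to the correct DATE_NO, so the
-- update can read from a single precomputed source.
CREATE VIEW vw_EndTimeKeyLookup AS
SELECT t.EndTime, d.DATE_NO
FROM TableToUpdate t
JOIN Dates d ON t.EndTime = d.DATE;
```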

我很坚强 2024-07-20 10:24:46


I note a couple of things here: is EndTimeKey really a key? If so, it may have an index on it, and the speed (or lack thereof) comes from updating the index while also doing the actual update of the data. The solution: drop the index, run the update, then re-apply the index.

Another issue could be the transactional nature of SQL - as you do this update, it will log every change so it can roll back in the event of a failure. This update looks to be pretty straightforward, so you could apply it in batches, i.e.

update TableToUpdate set
EndTimeKey = DATE_NO
from Dates
where EndTime = DATE
and TableToUpdateId between 1 and 100000

That will break your update into manageable size chunks - at the very least you'll get an idea how long each chunk will take.

Another option is putting an index on the EndTime column; without one, the update is potentially having to do a full table scan.

The real answer though is to look at the query plan being generated. As you can see there are many reasons why a query may run slow - these are just some quick ones to check.
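
The drop-index / re-index idea from the first point above can be sketched as (index names are assumptions; this only pays off when the update touches a large fraction of the table):

```sql
-- Drop the index on the updated column so the bulk update does not
-- maintain it row by row, then rebuild it once at the end.
DROP INDEX IX_TableToUpdate_EndTimeKey ON TableToUpdate;

UPDATE t
SET EndTimeKey = d.DATE_NO
FROM TableToUpdate t
JOIN Dates d ON t.EndTime = d.DATE;

CREATE INDEX IX_TableToUpdate_EndTimeKey ON TableToUpdate (EndTimeKey);

-- An index supporting the join predicate can also help the lookup:
CREATE INDEX IX_TableToUpdate_EndTime ON TableToUpdate (EndTime);
```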
