SQL transaction affecting a big amount of rows
The situation is as follows:
A big production client/server system has a central database table with a column that used to have NULL as its default value but now defaults to 0. All rows created before that change, of course, still contain NULL, and that generates a lot of unnecessary error messages in this system.
The solution is of course as simple as:
update theTable set theColumn = 0 where theColumn is null
But I guess this transaction will take a long time to complete? Apart from that, are there any other issues I should think about before doing this? Will this big transaction block the whole database, or that particular table, during the whole update?
This particular table has about 550,000 rows, and about 500,000 of them have NULL and will be affected by the statement above.
The impact on the performance of other connected clients depends on several factors:
The database engine will acquire write locks, so if your clients only need read access to the table, it should not be a big problem.
500,000 records does not sound like much to me, but as I said, the time and resources the update takes depend on many factors.
Do you have a similar test system where you can try out the update?
Another solution is to split the one big update into many small ones and run them in a loop.
If your clients write frequently to that table, your update statement might be blocked "forever". I have seen databases where performing the update row by row was the only way to get it through, but that was a table with about 200,000,000 records and about 500 very active clients!
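A minimal sketch of the "many small updates in a loop" idea, using Python's built-in sqlite3 module and the question's theTable/theColumn names. The batch size, the id-based subquery, and the row counts are assumptions for illustration; the exact SQL for limiting an UPDATE varies by vendor (SQL Server, for instance, has UPDATE TOP (n)).

```python
import sqlite3

# In-memory stand-in for the production table: 500 NULL rows to fix,
# 50 rows that already have a value and must be left alone.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (id INTEGER PRIMARY KEY, theColumn INTEGER)")
conn.executemany("INSERT INTO theTable (theColumn) VALUES (?)",
                 [(None,)] * 500 + [(1,)] * 50)
conn.commit()

BATCH = 100  # small transactions keep each lock window short (arbitrary size)
while True:
    cur = conn.execute(
        "UPDATE theTable SET theColumn = 0 "
        "WHERE id IN (SELECT id FROM theTable WHERE theColumn IS NULL LIMIT ?)",
        (BATCH,))
    conn.commit()  # release locks between batches so other clients get a turn
    if cur.rowcount == 0:
        break  # no NULL rows left

remaining = conn.execute(
    "SELECT COUNT(*) FROM theTable WHERE theColumn IS NULL").fetchone()[0]
print(remaining)  # 0
```

The commit inside the loop is the whole point: each batch holds its write locks only briefly, at the cost of the overall update no longer being atomic.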
There's no definite way to say. It depends a lot on the hardware, the number of concurrent sessions, whether the table is locked, the number of interdependent triggers, and so on.
If the "whole database" is dependent on this table, then it might.
If the table is already locked by another transaction, you might run into a row-lock situation, or in rare cases a deadlock. The best approach would be to ensure no one is using the table, check for any pre-existing locks, and then run the statement.
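The blocking scenario described above can be reproduced in miniature with Python's built-in sqlite3 module. This is only a stand-in: SQLite locks the whole database file, whereas client/server engines typically lock rows or pages, but the effect on the bulk update is the same.

```python
import os
import sqlite3
import tempfile

# Two connections to one database file simulate two clients.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

client_a = sqlite3.connect(path, timeout=0.1)
client_a.execute("CREATE TABLE theTable (id INTEGER PRIMARY KEY, theColumn INTEGER)")
client_a.execute("INSERT INTO theTable (theColumn) VALUES (NULL)")
client_a.commit()

# Client A opens a write transaction and holds it without committing.
client_a.execute("BEGIN IMMEDIATE")
client_a.execute("UPDATE theTable SET theColumn = 1 WHERE id = 1")

# Client B's bulk update now waits for the lock and times out.
client_b = sqlite3.connect(path, timeout=0.1)
try:
    client_b.execute("UPDATE theTable SET theColumn = 0 WHERE theColumn IS NULL")
    blocked = False
except sqlite3.OperationalError:  # "database is locked"
    blocked = True

client_a.rollback()
print(blocked)  # True
```

This is why checking for pre-existing locks (or running the update in a maintenance window) matters: a single long-lived writer is enough to stall the statement.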
Locking issues are vendor-specific.
Assuming there are no triggers on the table, half a million rows is not much for a dedicated database server, even with many indexes on the table.