Database table hosed by a bad query: what did it do, and will it roll back?
I made the error of running what should have been a quick update against my fact table (200M rows) with this:
update dbo.primary_fact
set count_of_loan_obligors = o.n
from dbo.staging_fact f
-- notice that this is not the same table as the one in the update clause
inner join ##Obligor_Count o
on (f.time_dimension_id = o.time_dimension_id
and f.account_dimension_id = o.account_dimension_id)
It should have been:
from dbo.primary_fact f
A correctly formed update like this (1 day, 87k accounts) typically finishes in a minute or two. After it had run for 12 minutes, I wondered what was taking so long and spotted my error.
I cancelled the query in SQL Server Management Studio, which I understand will roll back all the awful I caused (can someone confirm?).
But my bigger question is: What does the incorrectly formed query do?
Update: The cancel action finally completed, an hour and 39 minutes later. DBAs were too slow on the kill -- just as well.
Properly formed update finished in 8 seconds.
Second Update: There were no values set from the original (faulty) update following the successful cancel order in SSMS. I would interpret this to mean that any pending updates were rolled back.
Comments (3)
It will update dbo.primary_fact.count_of_loan_obligors with the same value for all rows. The value will be some value from ##Obligor_Count.n. It is hard to figure out which value that would be.
Here is a little test that basically does what you did:
Result:
In this case @T1 is updated with the first row in @T2.
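A minimal sketch of a test along these lines, since the answer's own listing and result are not shown here. The column names and values are made up for illustration; only the table variable names @T1 and @T2 come from the answer. @T1 stands in for dbo.primary_fact and @T2 for the joined source. Because the update target never appears in the FROM clause, nothing ties a target row to a particular source row, so every target row is paired with every source row and one of the candidate values is applied.

```sql
-- Hypothetical stand-ins: @T1 plays the fact table, @T2 plays the source rows.
DECLARE @T1 TABLE (id INT PRIMARY KEY, n INT NULL);
DECLARE @T2 TABLE (id INT PRIMARY KEY, n INT NOT NULL);

INSERT INTO @T1 (id, n) VALUES (1, NULL), (2, NULL), (3, NULL);
INSERT INTO @T2 (id, n) VALUES (1, 10), (2, 20), (3, 30);

-- The target (@T1) is not referenced in FROM, so there is no join
-- condition between target and source: this behaves like a cross join.
UPDATE @T1
SET n = s.n
FROM @T2 AS s;

-- Every row of @T1 has been touched. Which @T2 value each row receives
-- is not something the engine guarantees, so no expected output is shown.
SELECT id, n FROM @T1 ORDER BY id;
```

Scaled up to the question, this would also explain the run time: instead of touching about 87k rows for one day, the statement has to write all 200M rows of the fact table.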
Well, first of all it looks like you could probably just re-run your query with the correct from clause instead of the staging_fact table, and it will overwrite any booboos you made. That's the good news, and the joy of working with fact tables.
Bad news is that from my experience, SSMS doesn't roll anything back unless you actually run it within a transaction, so your data is probably a big bucket of fail right now.
Hope you enjoyed that last 12 minutes of your life, because you're about to have the pleasure of doing it again.
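Whether or not an explicit transaction is needed for a cancel to undo the work (the asker's second update reports that no values from the faulty statement survived, and a single UPDATE in SQL Server is atomic on its own), wrapping a large update in an explicit transaction does make it possible to check the damage before committing. A rough sketch using the tables from the question; the row-count threshold is a hypothetical number picked for illustration, not something from the thread.

```sql
SET XACT_ABORT ON;  -- a run-time error rolls back the whole transaction

DECLARE @rows INT;
DECLARE @expected_max INT = 100000;  -- hypothetical ceiling (about 87k accounts expected)

BEGIN TRANSACTION;

UPDATE dbo.primary_fact
SET count_of_loan_obligors = o.n
FROM dbo.primary_fact f
INNER JOIN ##Obligor_Count o
    ON  f.time_dimension_id    = o.time_dimension_id
    AND f.account_dimension_id = o.account_dimension_id;

-- Capture immediately: @@ROWCOUNT is reset by the next statement.
SET @rows = @@ROWCOUNT;

IF @rows > @expected_max
BEGIN
    ROLLBACK TRANSACTION;
    PRINT 'Rolled back: more rows touched than expected.';
END
ELSE
BEGIN
    COMMIT TRANSACTION;
END
```

The trade-off is that locks on the fact table are held until the COMMIT or ROLLBACK, so the check should stay short.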
I worry that neither version does anything good, because both are missing a WHERE clause connecting the table being updated with the source of the new values (f inner join o). I expect the line
WHERE primary_fact.time_dimension_id = f.time_dimension_id AND primary_fact.account_dimension_id = f.account_dimension_id
got dropped in the copy/paste.
As long as the referenced tables have columns with those names, the join of f and o will be performed just fine. Then those values will be used to update primary_fact, either with the WHERE clause or in some way I don't know. The UPDATE/FROM syntax is not standard SQL, but it is widely supported. Maybe SQL Server even adds in a default WHERE clause. PostgreSQL doesn't.
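On the "default WHERE clause" point: as far as I know, SQL Server does not add one, but when the update target is referenced exactly once in the FROM clause, the UPDATE applies to that reference, so the join conditions already tie each target row to its source row. Updating through the alias makes this explicit. A small sketch with made-up table variables and values (@fact and @counts are hypothetical stand-ins for dbo.primary_fact and ##Obligor_Count):

```sql
-- Hypothetical stand-ins for the fact table and the counts table.
DECLARE @fact   TABLE (time_dimension_id INT, account_dimension_id INT, n INT NULL);
DECLARE @counts TABLE (time_dimension_id INT, account_dimension_id INT, n INT NOT NULL);

INSERT INTO @fact   VALUES (1, 100, NULL), (1, 200, NULL), (2, 100, NULL);
INSERT INTO @counts VALUES (1, 100, 3), (1, 200, 5);

-- The target is named by its alias, so the row being updated is the same
-- row that took part in the join; no extra WHERE clause is needed.
UPDATE f
SET n = c.n
FROM @fact AS f
INNER JOIN @counts AS c
    ON  f.time_dimension_id    = c.time_dimension_id
    AND f.account_dimension_id = c.account_dimension_id;

-- Rows without a match in @counts (here time_dimension_id = 2) are left alone.
SELECT time_dimension_id, account_dimension_id, n
FROM @fact
ORDER BY time_dimension_id, account_dimension_id;
```

PostgreSQL treats the FROM list as additional tables only, which is why the explicit WHERE against the target is required there.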