SQL deadlock question

Posted 2024-09-10 09:15:46

Is it possible in relational databases for these two statements to deadlock? I'm trying to simplify my question and example -- please just assume that these selects, which I think would normally only require sharable read-locking, now require exclusive read locks:

Concurrent Connection 1:
SELECT {...}
FROM A 
JOIN B ON {...}

Concurrent Connection 2:
SELECT {...}
FROM B 
JOIN A ON {...}

That is, does the ordering of the joins matter? Are single statements in SQL atomic? Is A locked first and then B in the first statement and B locked first and then A in the second statement?

I think not. My gut tells me that two single statements like this cannot deadlock, no matter how complex. I believe that a statement is analyzed as a whole and that the resources requiring locking are locked using some deterministic global order (e.g. alphabetically). But I need more than a gut feeling on this: I can't think of a way to prove it and I can't find it documented.

I'm interested in MS SQL 2005, but I don't think the question is implementation specific.

Secondarily: as it relates to MS SQL, I'd also like to know whether Common Table Expressions have this guarantee, i.e. that CTEs are mostly a syntactic benefit (+recursion), consolidated into a traditional single statement by the engine.

Answer by 难忘№最初的完美, 2024-09-17 09:15:46

SELECTs cannot deadlock with other SELECTs, because they only acquire shared locks. You say that we should consider that these SELECTs now 'require exclusive read locks', but this is not possible for us to consider because 1) there is no such thing as an exclusive read lock and 2) reads don't acquire exclusive locks.

But you do pose a more general question: whether simple statements can deadlock. The answer is a definite, resounding YES. Locks are acquired at execution time; they are not analyzed upfront, sorted, and then acquired in some order. It would be impossible for the engine to know the needed locks upfront, because they depend on the actual data on disk, and to read the data the engine needs to ... lock the data.

Deadlocks between simple statements (SELECT vs. UPDATE or SELECT vs. DELETE) due to different index access order are quite common and very easy to investigate, diagnose and fix. But note that there is always a write operation involved, as reads cannot block each other. For this discussion, adding an UPDLOCK or XLOCK hint to a SELECT should be considered a write. You don't even need a JOIN: a secondary index may well introduce the access order problem leading to deadlock, see Read/Write Deadlock.
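
To make the access-order problem concrete, here is a minimal single-threaded sketch in Python. It is a toy model, not how SQL Server's lock manager is implemented: `ToyLockManager`, `simulate`, and the resource names `secondary_index_key` and `clustered_index_row` are all made up for illustration. The assumption is that the reader goes through the secondary index and then the base row, while the writer touches the same two resources in the opposite order.

```python
from collections import defaultdict


def compatible(held_mode, wanted_mode):
    """Shared (S) locks coexist; exclusive (X) conflicts with everything."""
    return held_mode == "S" and wanted_mode == "S"


class ToyLockManager:
    """Hypothetical lock table for illustration only (not SQL Server's)."""

    def __init__(self):
        self.held = defaultdict(dict)  # resource -> {transaction: mode}
        self.waits_for = {}            # transaction -> transaction blocking it

    def request(self, txn, resource, mode):
        """Grant the lock and return True, or record the blocker and return False."""
        for other, other_mode in self.held[resource].items():
            if other != txn and not compatible(other_mode, mode):
                self.waits_for[txn] = other
                return False
        self.held[resource][txn] = mode
        return True

    def deadlocked(self, txn):
        """Follow the wait-for chain from txn; coming back to txn is a cycle."""
        seen, current = set(), txn
        while current in self.waits_for:
            current = self.waits_for[current]
            if current == txn:
                return True
            if current in seen:
                return False
            seen.add(current)
        return False


def simulate(reader_mode):
    """Interleave a reader and a writer that touch two resources in opposite order."""
    locks = ToyLockManager()
    results = [
        # Reader seeks the secondary index first ...
        locks.request("reader", "secondary_index_key", reader_mode),
        # ... while the writer starts by updating the base row.
        locks.request("writer", "clustered_index_row", "X"),
        # Reader now needs the base row, which the writer holds exclusively.
        locks.request("reader", "clustered_index_row", reader_mode),
        # Writer now needs the index key, which the reader holds.
        locks.request("writer", "secondary_index_key", "X"),
    ]
    return results, locks.deadlocked("reader")
```

In this model `simulate("S")` returns `([True, True, False, False], True)`: both second requests block and the wait-for chain forms a cycle, even though each side is a single statement. Two transactions that request only `"S"` locks never block each other, whatever order they use.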

And finally, whether you write SELECT FROM A JOIN B or SELECT FROM B JOIN A is completely irrelevant. The query optimizer is free to rearrange the access order as it sees fit; the actual text of the query does not impose the order of execution in any way.

Updated

"How then can we construct a general strategy toward a READ COMMITTED 'multiple entity' database that doesn't deadlock?"

I'm afraid there is no cookie-cutter recipe. The solution will vary from case to case. Ultimately, in database applications deadlocks are a fact of life. I understand this may sound absurd, as in 'we landed on the Moon but we can't write a correct database application', but there are strong factors at play which pretty much guarantee that applications will eventually encounter deadlocks. Luckily, deadlocks are the easiest errors to deal with: simply read the state again, apply the logic, and rewrite the new state. That being said, there are some good practices that can dramatically reduce the frequency of deadlocks, down to the point where they all but vanish:
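
The 'read the state again, apply the logic, rewrite the new state' handling can be sketched as a small retry wrapper. This is only an illustration in Python: `DeadlockError` and `run_with_deadlock_retry` are hypothetical names, and in a real application you would map your driver's deadlock-victim error (error 1205 in SQL Server) onto something like it. The `transaction` callable is assumed to redo the whole read/compute/write cycle each time it is called.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver error raised for a deadlock victim."""


def run_with_deadlock_retry(transaction, max_attempts=3, base_delay=0.0):
    """Call transaction() again from scratch when it is chosen as deadlock victim.

    transaction must repeat its own reads on every call, so each retry
    works from fresh state rather than from values read before the rollback.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            # Back off a little longer after each failed attempt.
            time.sleep(base_delay * attempt)
```

The retry limit and the linear backoff are arbitrary choices for the sketch; the only essential part is that the whole transaction is rerun, not just the statement that failed.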

Try to have a consistent access pattern for writes. Have clearly defined rules stating things such as 'a transaction shall always access tables in this order: Customers -> OrderHeaders -> OrderLines.' Note that the order has to be obeyed inside a transaction. Basically, rank all tables in your schema and specify that all updates must occur in ranking order. This eventually boils down to the code discipline of the individual contributor writing the code, who has to ensure that updates are written in the proper order inside a transaction.
Reduce the duration of writes. The usual wisdom goes like this: at the beginning of the transaction do all the reads (read the existing state), then process the logic and compute new values, then write all updates at the end of the transaction. Avoid a pattern like 'read->write->logic->read->write'; instead do 'read->read->logic->write->write'. Of course, the true craftsmanship consists in how to deal with actual, real, individual cases where apparently one must do writes mid-transaction. A special note is due here about a specific type of transaction: those driven by a queue, which by definition start their activity by dequeuing (= a write) from the queue. These applications have always been notoriously difficult to write and prone to errors (especially deadlocks). Luckily there are ways to do it, see Using tables as Queues.
Reduce the amount of reads. Table scans are the most prevalent cause of deadlocks. Proper indexing will not only eliminate the deadlocks, but may also boost performance in the process.
Snapshot isolation. This is the closest thing you'll get to a free lunch in regard to avoiding deadlocks. I intentionally put it last, because it may mask other problems (like improper indexing) instead of fixing them.
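
As a rough illustration of the 'rank all tables' rule, the sketch below collects a transaction's pending writes and sorts them by a fixed table rank before they are issued. The names `TABLE_RANK` and `order_writes` are hypothetical, and the statements are placeholder strings; the table names are the ones from the example rule above.

```python
# Assumed ranking, taken from the example rule:
# Customers -> OrderHeaders -> OrderLines.
TABLE_RANK = {"Customers": 0, "OrderHeaders": 1, "OrderLines": 2}


def order_writes(pending_writes):
    """Return (table, statement) pairs sorted by table rank.

    sorted() is stable, so writes to the same table keep the order
    in which the application queued them.
    """
    return sorted(pending_writes, key=lambda write: TABLE_RANK[write[0]])


# Writes queued in whatever order the business logic produced them.
queued = [
    ("OrderLines", "insert line 1"),
    ("Customers", "update balance"),
    ("OrderHeaders", "insert header"),
    ("OrderLines", "insert line 2"),
]
ordered = order_writes(queued)
```

Here `ordered` starts with the `Customers` write, then `OrderHeaders`, then the two `OrderLines` writes in their original relative order. Queuing writes and flushing them at the end also fits the 'read->read->logic->write->write' pattern, though it does not help in the cases where a write really has to happen mid-transaction.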

I'm afraid that trying to solve this problem with a LockCustomerByXXX approach doesn't work. Pessimistic locking doesn't scale. Optimistic concurrency updates are the way to go if you want to have any sort of decent performance.
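
One common way to do an optimistic concurrency update is a version column that is checked in the WHERE clause of the UPDATE. The sketch below uses Python's built-in `sqlite3` module purely so that it is runnable anywhere; the `Customers` schema, the `version` column, and the helper names are assumptions for illustration, and the same UPDATE pattern would need adapting to your actual SQL Server schema (for example with a `rowversion` column).

```python
import sqlite3


def make_db():
    """Create an in-memory database with one customer row (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customers ("
        "id INTEGER PRIMARY KEY, name TEXT, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO Customers VALUES (1, 'Alice', 1)")
    conn.commit()
    return conn


def read_customer(conn, customer_id):
    """Read the current state together with the version it was read at."""
    return conn.execute(
        "SELECT name, version FROM Customers WHERE id = ?", (customer_id,)
    ).fetchone()


def update_customer_name(conn, customer_id, new_name, expected_version):
    """Write only if nobody changed the row since we read it.

    Returns True when the update was applied, False when the version
    check failed and the caller should re-read and retry.
    """
    cursor = conn.execute(
        "UPDATE Customers SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, customer_id, expected_version),
    )
    conn.commit()
    return cursor.rowcount == 1
```

No lock is held between the read and the write. A writer working from a stale version simply gets `False` back and goes through the same read, logic, write cycle again.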
