数据库事务是否可以防止竞争条件?
我并不完全清楚数据库系统中的事务是做什么的。我知道它们可以用来完全回滚更新列表(例如,从一个帐户中扣除资金并将其添加到另一个帐户中),但这就是它们所做的一切吗?具体来说,它们可以用来防止竞争条件吗?例如:(
// Java/JPA example
em.getTransaction().begin();
User u = em.find(User.class, 123);
u.credits += 10;
em.persist(u); // Note added in 2016: this line is actually not needed
em.getTransaction().commit();
我知道这可能会被编写为单个更新查询,但情况并非总是如此)
此代码是否受到竞争条件的保护?
我最感兴趣的是 MySQL5 + InnoDB,但也欢迎一般答案。
It's not entirely clear to me what transactions in database systems do. I know they can be used to rollback a list of updates completely (e.g. deduct money on one account and add it to another), but is that all they do? Specifically, can they be used to prevent race conditions? For example:
// Java/JPA example
em.getTransaction().begin();
User u = em.find(User.class, 123);
u.credits += 10;
em.persist(u); // Note added in 2016: this line is actually not needed
em.getTransaction().commit();
(I know this could probably be written as a single update query, but that's not alway the case)
Is this code protected against race conditions?
I'm mostly interested in MySQL5 + InnoDB, but general answers are welcome too.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
TL/DR:事务本质上不会阻止所有竞争条件。在所有实际数据库实现中,您仍然需要锁定、中止和重试处理或其他保护措施。 交易不是您可以添加到查询中的秘密武器它们不受所有并发影响。
隔离
您的问题所要表达的是ACID 中的I - 隔离。纯粹的学术想法是事务应该提供完美的隔离,以便结果与每个事务串行执行的结果相同。事实上,在实际的 RDBMS 实现中很少出现这种情况;功能因实现而异,并且可以通过使用较弱的隔离级别(例如
READ COMMITTED
)来削弱规则。实际上,您不能假设事务会阻止所有竞争条件,即使是在“SERIALIZABLE”隔离下也是如此。一些 RDBMS 的能力比其他 RDBMS 更强。例如,PostgreSQL 9.2 及更高版本具有相当好的
SERIALIZABLE
隔离,可以检测事务之间大多数(但不是全部)可能的交互,并中止除其中一个冲突事务之外的所有事务。因此它可以非常安全地并行运行事务。很少有系统(如果有的话3)具有真正完美的
SERIALIZABLE
隔离,可以防止所有可能的竞争和异常,包括锁升级和锁排序死锁等问题。即使有强大的隔离,一些系统(如 PostgreSQL)也会中止冲突的事务,而不是让它们等待并串行运行它们。您的应用程序必须记住它正在做什么并重新尝试事务。因此,虽然事务阻止了将并发相关的异常存储到数据库中,但它的存储方式对应用程序来说并不透明。
原子性
可以说,数据库事务的主要目的是提供原子提交。在您提交事务后,更改才会生效。当您提交时,就其他事务而言,所有更改都会同时生效。任何事务都无法仅看到事务所做的1,2部分更改。类似地,如果您
ROLLBACK
,那么该事务的任何更改都不会被任何其他事务看到;就好像您的交易从未存在过一样。这就是ACID 中的A。
耐用性
另一个是耐用性 - ACID 中的 D。它指定当您提交事务时,它必须真正保存到存储中,以便能够在断电或突然重新启动等故障中幸存下来。
一致性:
参见wikipedia
乐观并发控制
而不是使用锁定和/或高隔离级别,Hibernate、EclipseLink 等 ORM 很常见使用 乐观并发控制(通常称为“乐观锁定”),以克服较弱隔离级别的限制,同时保持性能。
这种方法的一个关键特征是,它允许您跨多个事务进行工作,这对于用户数量较多且与任何给定用户的交互之间可能存在较长延迟的系统来说是一个很大的优势。
参考资料
除了文本链接之外,请参阅 有关锁定的 PostgreSQL 文档章节、隔离和并发。即使您使用不同的 RDBMS,您也会从它解释的概念中学到很多东西。
1为了简单起见,我在这里忽略了很少实现的
READ UNCOMMITTED
隔离级别;它允许脏读。2正如 @meriton 指出的那样,推论不一定正确。 幻读发生在
SERIALIZABLE
以下的任何内容中。正在进行的事务的一部分看不到任何更改(通过尚未提交的事务),然后当其他事务处理时,正在进行的事务的下一部分确实看到更改承诺。3 嗯,IIRC SQLite2 在尝试写入时锁定整个数据库,但这并不是我所说的并发问题的理想解决方案。
TL/DR: Transactions do not inherently prevent all race conditions. You still need locking, abort-and-retry handling, or other protective measures in all real-world database implementations. Transactions are not a secret sauce you can add to your queries to make them safe from all concurrency effects.
Isolation
What you're getting at with your question is the I in ACID - isolation. The academically pure idea is that transactions should provide perfect isolation, so that the result is the same as if every transaction executed serially. In reality that's rarely the case in real RDBMS implementations; capabilities vary by implementation, and the rules can be weakened by use of a weaker isolation level like
READ COMMITTED
. In practice you cannot assume that transactions prevent all race conditions, even atSERIALIZABLE
isolation.Some RDBMSs have stronger abilities than others. For example, PostgreSQL 9.2 and newer have quite good
SERIALIZABLE
isolation that detects most (but not all) possible interactions between transactions and aborts all but one of the conflicting transactions. So it can run transactions in parallel quite safely.Few, if any3, systems have truly perfect
SERIALIZABLE
isolation that prevents all possible races and anomalies, including issues like lock escalation and lock ordering deadlocks.Even with strong isolation some systems (like PostgreSQL) will abort conflicting transactions, rather than making them wait and running them serially. Your app must remember what it was doing and re-try the transaction. So while the transaction has prevented concurrency-related anomalies from being stored to the DB, it's done so in a manner that is not transparent to the application.
Atomicity
Arguably the primary purpose of a database transaction is that it provides for atomic commit. The changes do not take effect until you commit the transaction. When you commit, the changes all take effect at the same instant as far as other transactions are concerned. No transaction can ever see just some of the changes a transaction makes1,2. Similarly, if you
ROLLBACK
, then none of the transaction's changes ever get seen by any other transaction; it's as if your transaction never existed.That's the A in ACID.
Durability
Another is durability - the D in ACID. It specifies that when you commit a transaction it must truly be saved to storage that will survive a fault like power loss or a sudden reboot.
Consistency:
See wikipedia
Optimistic concurrency control
Rather than using locking and/or high isolation levels, it's common for ORMs like Hibernate, EclipseLink, etc to use optimistic concurrency control (often called "optimistic locking") to overcome the limitations of weaker isolation levels while preserving performance.
A key feature of this approach is that it lets you span work across multiple transactions, which is a big plus with systems that have high user counts and may have long delays between interactions with any given user.
References
In addition to the in-text links, see the PostgreSQL documentation chapter on locking, isolation and concurrency. Even if you're using a different RDBMS you'll learn a lot from the concepts it explains.
1I'm ignoring the rarely implemented
READ UNCOMMITTED
isolation level here for simplicity; it permits dirty reads.2As @meriton points out, the corollary isn't necessarily true. Phantom reads occur in anything below
SERIALIZABLE
. One part of an in-progress transaction doesn't see some changes (by a not-yet-committed transaction), then the next part of the in-progress transaction does see the changes when the other transaction commits.3 Well, IIRC SQLite2 does by virtue of locking the whole database when a write is attempted, but that's not what I'd call an ideal solution to concurrency issues.
数据库层在不同程度上支持事务的原子性,称为隔离级别。检查数据库管理系统的文档,了解支持的隔离级别及其权衡。最强的隔离级别,可序列化,要求事务像逐个执行一样执行。这通常是通过在数据库中使用排它锁来实现的。这可能会导致死锁,数据库管理系统会通过回滚一些涉及的事务来检测并修复死锁。这种方法通常被称为悲观锁定。
许多对象关系映射器(包括 JPA 提供程序)还支持乐观锁定,其中更新冲突不会在数据库中被阻止,而是在应用程序层中检测到,然后回滚事务。如果启用了乐观锁定,示例代码的典型执行将发出以下 sql 查询:
假设返回 (123, 13, 100)。
数据库告诉我们更新了多少行。如果是的话,则不存在冲突的更新。如果它为零,则发生冲突的更新,JPA 提供程序将执行
并抛出异常,以便应用程序代码可以处理失败的事务,例如通过重试。
摘要:无论采用哪种方法,您的语句都可以免受竞争条件的影响。
The database tier supports atomicity of transactions to varying degrees, called isolation levels. Check the documentation of your database management system for the isolation levels supported, and their trade-offs. The strongest isolation level, Serializable, requires transactions to execute as if they were executed one by one. This is typically implemented by using exclusive locks in the database. This can be cause deadlocks, which the database management system detects and fixes by rolling back some involved transactions. This approach is often referred to as pessimistic locking.
Many object-relational mappers (including JPA providers) also support optimistic locking, where update conflicts are not prevented in the database, but detected in the application tier, which then rolls back the transaction. If you have optimistic locking enabled, a typical execution of your example code would emit the following sql queries:
Let's say this returns (123, 13, 100).
The database tells us how many rows where updated. If it was one, there was no conflicting update. If it was zero, a conflicting update occurred, and the JPA provider will do
and throw an exception so application code can handle the failed transaction, for instance by retrying.
Summary: With either approach, your statement can be made safe from race conditions.
它取决于隔离级别(在可序列化中,它将防止竞争条件,因为通常在可序列化隔离级别中,事务按顺序处理,而不是并行处理(或者至少使用独占锁定,因此修改相同行的事务在顺序)。
为了防止竞争条件,最好手动锁定记录(例如mysql支持'select ... for update'语句,该语句需要对所选记录进行写锁定)
It depends on isolation level (in serializable it will prevent race condition, since generally in serializable isolation level transactions are processed in sequence, not in paralell (or at least exclusive locking is used, so transactions, that modify the same rows, are performed in sequence).
In order to prevent the race condition, better manually lock the record (mysql for example supports 'select ... for update' statement, which aquires write-lock on the selected records)
这取决于具体的rdbms。通常,事务会根据查询评估计划期间的决定获取锁。有的可以请求表级锁,有的可以请求列级锁,有的可以请求记录级锁,出于性能考虑优先选择第二个。对你的问题的简短回答是肯定的。
换句话说,事务旨在对一组查询进行分组并将它们表示为原子操作。如果操作失败,更改将回滚。我不太清楚您使用的适配器是做什么的,但如果它符合事务的定义,您应该没问题。
虽然这保证了防止竞争条件,但它并不能明确防止饥饿或死锁。事务锁管理器负责此工作。有时会使用表锁,但它们会因减少并发操作数量而付出高昂的代价。
It depends on the specific rdbms. Generally, transactions acquire locks as decided during the query evaluation plan. Some can request table level locks, other column level, other record level, the second is preferred for performance. The short answer to your question is yes.
In other words, a transaction is meant to group a set of queries and represent them as an atomic operation. If the operation fails the changes are rolledback. I don't exactly know what the adapter you're using does, but if it conforms to the definition of transactions you should be fine.
While this guarantees prevention of race conditions, it doesn't explicitly prevent starvation or deadlocks. The transaction lock manager is in charge of that. Table locks are sometime used, but they come with a hefty price of reducing the number of concurrent operations.