“已提交读”和“已提交读”之间的区别和“可重复读取”在 SQL Server 中
我认为上面的隔离级别非常相似。有人可以用一些很好的例子来描述主要区别是什么吗?
I think the above isolation levels are so alike. Could someone please describe with some nice examples what the main difference is?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
已提交读是一种隔离级别,可保证读取时已提交任何数据读取。它只是限制读者看到任何中间的、未提交的“脏”读物。它不承诺如果事务重新发出读取,将找到相同数据,数据在读取后可以自由更改。
可重复读是一种更高的隔离级别,除了保证读已提交级别之外,还保证读到的任何数据不能改变,如果事务再次读取相同的数据,就会发现先前读取的数据就位,未更改且可供读取。
下一个隔离级别(可序列化)提供了更强大的保证:除了所有可重复读取保证之外,它还保证后续读取无法看到新数据。
假设您有一个表 T,其中包含 C 列,其中包含一行,假设它的值为“1”。假设您有一个如下所示的简单任务:
这是一个从表 T 中发出两次读取的简单任务,两次读取之间有 1 分钟的延迟。
如果您遵循上面的逻辑,您很快就会意识到,SERIALIZABLE 事务虽然可以让您的生活变得轻松,但总是完全阻止每个可能的并发操作,因为它们要求任何人都不能修改、删除或插入任何行。 .Net System.Transactions 作用域的默认事务隔离级别是可序列化的,这通常可以解释所导致的糟糕性能。
最后,还有 SNAPSHOT 隔离级别。 SNAPSHOT 隔离级别提供与可串行化相同的保证,但不要求任何并发事务都可以修改数据。相反,它迫使每个读者看到自己版本的世界(自己的“快照”)。这使得编程变得非常容易,并且非常可扩展,因为它不会阻止并发更新。然而,这种好处是有代价的:额外的服务器资源消耗。
补充阅读:
Read committed is an isolation level that guarantees that any data read was committed at the moment is read. It simply restricts the reader from seeing any intermediate, uncommitted, 'dirty' read. It makes no promise whatsoever that if the transaction re-issues the read, will find the Same data, data is free to change after it was read.
Repeatable read is a higher isolation level, that in addition to the guarantees of the read committed level, it also guarantees that any data read cannot change, if the transaction reads the same data again, it will find the previously read data in place, unchanged, and available to read.
The next isolation level, serializable, makes an even stronger guarantee: in addition to everything repeatable read guarantees, it also guarantees that no new data can be seen by a subsequent read.
Say you have a table T with a column C with one row in it, say it has the value '1'. And consider you have a simple task like the following:
That is a simple task that issue two reads from table T, with a delay of 1 minute between them.
If you follow the logic above you can quickly realize that SERIALIZABLE transactions, while they may make life easy for you, are always completely blocking every possible concurrent operation, since they require that nobody can modify, delete nor insert any row. The default transaction isolation level of the .Net
System.Transactions
scope is serializable, and this usually explains the abysmal performance that results.And finally, there is also the SNAPSHOT isolation level. SNAPSHOT isolation level makes the same guarantees as serializable, but not by requiring that no concurrent transaction can modify the data. Instead, it forces every reader to see its own version of the world (its own 'snapshot'). This makes it very easy to program against as well as very scalable as it does not block concurrent updates. However, that benefit comes with a price: extra server resource consumption.
Supplemental reads:
可重复读取
数据库的状态从事务开始时就得到维护。如果您在 session1 中检索一个值,然后在 session2 中更新该值,在 session1 中再次检索该值将返回相同的结果。读取是可重复的。
已提交读
在事务上下文中,您将始终检索最近提交的值。如果您在 session1 中检索一个值,在 session2 中更新它,然后再次在 session1 中检索它,您将获得在 session2 中修改的值。它读取最后提交的行。
有道理吗?
Repeatable Read
The state of the database is maintained from the start of the transaction. If you retrieve a value in session1, then update that value in session2, retrieving it again in session1 will return the same results. Reads are repeatable.
Read Committed
Within the context of a transaction, you will always retrieve the most recently committed value. If you retrieve a value in session1, update it in session2, then retrieve it in session1again, you will get the value as modified in session2. It reads the last committed row.
Makes sense?
根据我对这个线程的阅读和理解,简单的答案是基于这个简单的场景:
有两个事务 A 和 B。它们按以下顺序执行以下操作。
Simply the answer according to my reading and understanding to this thread and @remus-rusanu answer is based on this simple scenario:
There are two transactions A and B. They perform the following operations in this following sequence.
老问题已经有一个公认的答案,但我喜欢考虑这两个隔离级别如何改变 SQL Server 中的锁定行为。这对于那些像我一样正在调试死锁的人可能会有所帮助。
READ COMMITTED(默认)
共享锁在 SELECT 中获取,然后当 SELECT 语句完成时释放。这就是系统如何保证未提交数据不会被脏读的原因。在 SELECT 完成之后和事务完成之前,其他事务仍然可以更改基础行。
可重复读取
共享锁在 SELECT 中获取,然后仅在事务完成后释放。这就是系统如何保证您读取的值在事务期间不会更改(因为它们在事务完成之前保持锁定状态)。
Old question which has an accepted answer already, but I like to think of these two isolation levels in terms of how they change the locking behavior in SQL Server. This might be helpful for those who are debugging deadlocks like I was.
READ COMMITTED (default)
Shared locks are taken in the SELECT and then released when the SELECT statement completes. This is how the system can guarantee that there are no dirty reads of uncommitted data. Other transactions can still change the underlying rows after your SELECT completes and before your transaction completes.
REPEATABLE READ
Shared locks are taken in the SELECT and then released only after the transaction completes. This is how the system can guarantee that the values you read will not change during the transaction (because they remain locked until the transaction finishes).
尝试用简单的图表来解释这个疑惑。
Trying to explain this doubt with simple diagrams.
这里还有其他答案,但它们没有提供有关底层数据库体系结构的任何详细信息,这使得很难理解事务隔离级别为何以它们的方式运行以及解决了哪些问题。
并发环境中常见问题的概述
数据库系统允许多个并发连接。这会导致在其他并发系统中出现相同类型的问题。例如,在多线程环境中,互斥锁可以防止对内存的并发访问,从而解决可能导致数据损坏或无效的竞争条件问题。
同样,由于数据库系统允许并发 CRUD 操作(更新、插入、删除、选择),因此多个连接的并发操作可能会导致不良的观察行为。
请注意,数据库行操作的原子性可防止彻底的数据损坏或不一致,因此始终强制执行基本级别的事务隔离。
有关详细信息,请参阅 ACID。 (原子性、一致性、隔离性、持久性。)简短的解释是,在每行的基础上,操作是原子的。这意味着通过防止出现一种情况来防止数据损坏:在另一个连接通过将其数据部分写入同一行而损坏该数据之前,一个连接会写入部分行数据。 (这对于熟悉多线程环境的人来说会更直观。)
上述问题与多线程编程中出现的问题类似,其中一个线程开始写入内存块,然后另一个线程到来并将其数据部分写入同一内存块。在第一个线程完成之前阻塞。这会导致数据不一致。
首先了解行操作的原子性质非常重要,因为这已经提供了基本级别的保护。
事务隔离级别的类型
我们将了解以下事务隔离级别,这些级别在 MariaDB 和许多其他 SQL 数据库实现中可用。
我们首先需要知道不同的隔离级别是什么:
事务隔离级别解决了哪些问题?
在解释这些不同选项的作用之前,了解每个选项解决的问题非常重要。以下是潜在不良行为的列表。
脏读:
数据库操作通常组合在一起形成一个事务。然后,该事务要么作为一组操作提交到数据库,要么执行回滚以丢弃该组操作。
如果一个连接作为事务启动一系列操作,然后第二个连接开始从同一个表读取数据,则第二个连接可以从已提交的数据库中读取数据。 ,或者它也可以读取作为开放但未提交的事务的一部分所做的更改。
这定义了已提交的读和未提交的读之间的区别。
这在概念上是不寻常的,因为读取未提交的数据通常没有多大意义。事务的全部目的是确保数据库中的数据不会以这种方式更改。
总结一下:
奇怪
。如果读操作模式设置为“读已提交”,则可能会出现重复读。这种模式解决了脏读的问题,因为只读取已提交的数据。
可能的写操作有:
当读操作读取一组行时,就会发生不可重复读,然后重复该操作,返回同一组行(相同的键),但返回非键数据已经改变了。
例如:
where
子句)选择的。where
或其他过滤器子句中的选择标准定义。幻读
幻读是不可重复读的扩展。
插入
或删除
操作可能会更改返回行集中返回的行。插入操作可以将新行添加到返回的行集中。删除操作可能会执行相反的操作,并从返回的行集中删除行。总结一下:
隔离级别
鉴于我们对潜在不良行为的理解,隔离级别的作用如下:
更高级别的隔离需要“锁定”数据库中的更多数据以防止并发访问。这可能是不可取的,因为如果 DMBS 持有整个表的锁,则其他连接都无法修改该数据。这可能会导致需要访问数据库的其他进程挂起。
提交读通常是最明智的选择。它确保您(数据库管理员)只能看到已提交的数据(持久数据,而不是瞬态数据),并且不会导致其他进程挂起。
参考书目
进一步阅读:
维基百科 ACID 操作
MariaDB 中的隔离级别
Geeks for Geeks 隔离级别。 (请注意,有些信息没有任何意义,例如对已提交读的解释,它指出这会导致提交读取的任何未提交的数据。这是不正确的,也没有意义。未提交的数据是仅通过显式提交操作提交。)
不可重复读取与幻像读取< /a>
There are other answers here, but they don't give any details about the underlying database architecture, which makes it difficult to understand why the transaction isolation levels function the way they do, and what problems are solved.
General Overview of Common Problems in Concurrent Environments
Database systems permit multiple concurrent connections. This leads to the same kinds of problems seen in other concurrent systems. For example, in multithreaded environments, mutexes prevent concurrent access to memory, thus solving the problem of race conditions which can lead to corrupt or invalid data.
In a similar way, because database systems permit concurrent CRUD operations (update, insert, delete, select), concurrent operations by multiple connections can lead to undesirable observed behaviour.
Note that the atomicity of database row operations prevents outright data corruption or inconsistency, so there is always a base level of transaction isolation enforced at all times.
For more information on this, see ACID. (Atomic, Consistent, Isolation, Durability.) The short explanation is that on a per-row basis, operations are atomic. This means prevents data corruption by preventing a situation whereby one connection would write part of a rows data, before another connection corrupted that data by partially writing its data to the same row. (This will be more intuitive to those familiar with multithreaded environments.)
The above described problem is analagous to problem seen in multithreaded programming whereby one thread begins writing to a block of memory, and then another thread comes and partially writes its data to the same block, before the first thread is finished. This results in inconsistent data.
It is important to understand the atomic nature of row operations first, because this already provides a base level of protection.
Types of Transaction Isolation Levels
We will look at the following Transaction Isolation levels, whcih are available in MariaDB and many other SQL database implementations.
We first need to know what the different isolation levels are:
What problems are solved by the Transaction Isolation Levels?
Before explaining what these different options do, it is important to understand the problems solved by each of them. Here is a list of potential unwanted behaviours.
Dirty Read:
Database operations are often grouped together into a transaction. This transaction is then either committed to the database as a group of operations, or a rollback is perfomed to discard the group of operations.
If one Connection starts a series of operations as a transaction, and then a second Connection begins reading data from the same table(s), the second Connection could either read the data from the database which has been committed, or it could also read the changes made as part of the open, and yet uncommitted, transaction.
This defines the difference between read committed and read uncommitted.
This is conceptually unusual, because it often doesn't make much sense to read uncommitted data. The whole point of transactions is to ensure data in the database does not change in this way.
To summarize:
Non-Repeatable Read
Given the above, a Non-Repeatable read may occur if the read operation mode is set to "Read Committed". This mode solves the problem of Dirty Read because only committed data is read.
The possible write operations are:
A Non-Repeatable Read occurs when a read operation reads a set of rows, and then the operation is repeated and the same set of rows (same keys) are returned, but the non-key data has changed.
For example:
where
clause.where
, or other filter, clause.Phantom Reads
Phantom Reads are an extension of the Non-Repeatable read. An
insert
ordelete
operation may change the rows returned in the set of returned rows. An insert operation may add new rows to the set of returned rows. A delete operation may do the opposite and remove rows from the set of returned rows.To summarize:
Isolation Levels
Given our understanding of the potential undesirable behaviours, this is what the isolation levels do:
Higher levels of isolation require more data in the database to be "locked" preventing concurrent access. This can be undesirable, because if a DMBS is holding a lock over a whole table, then no other connections can modify that data. This might cause other processes which need access to the database to hang.
Read Committed is typically the most sensible choice. It ensures that you, the Database Administrator, see only the committed data (data which is persistent, not transient) and it will not cause other processes to hang.
Bibliography
Further reading:
Wikipedia ACID Operations
Isolation Levels in MariaDB
Geeks for Geeks Isolation Levels. (Be aware some of the information doesn't make any sense, for example the explanation of Read Committed, which states that this causes a commit of any uncommitted data which is read. That is not correct and does not make sense. Uncommitted data is only committed by an explicit commit operation.)
Non-Repeatable Read vs Phantom Read
我认为这张图也很有用,当我想快速记住隔离级别之间的差异时,它可以帮助我作为参考(感谢 kudvenkat 在 YouTube 上)
I think this picture can also be useful, it helps me as a reference when I want to quickly remember the differences between isolation levels (thanks to kudvenkat on youtube)
请注意,可重复读取中的可重复针对的是元组,而不是整个表。在 ANSC 隔离级别中,可能会出现幻读异常,这意味着使用相同的 WHERE 子句读取表两次可能会返回不同的结果集。从字面上看,它是不可重复的。
Please note that, the repeatable in repeatable read regards to a tuple, but not to the entire table. In ANSC isolation levels, phantom read anomaly can occur, which means reading a table with the same WHERE clause twice may return different result sets. Literally, it's not repeatable.
我对最初接受的解决方案的观察。
在 RR 下(默认 mysql) - 如果一个 tx 打开并且已触发 SELECT,则另一个 tx 无法删除属于先前 READ 结果集的任何行,直到提交前一个 tx(事实上,新 tx 中的删除语句将挂起) ,但是下一个交易可以毫无问题地从表中删除所有行。顺便说一句,前一个 tx 中的下一个 READ 仍然会看到旧数据,直到它被提交。
My observation on initial accepted solution.
Under RR (default mysql) - If a tx is open and a SELECT has been fired, another tx can NOT delete any row belonging to previous READ result set until previous tx is committed (in fact delete statement in the new tx will just hang), however the next tx can delete all rows from the table without any trouble. Btw, a next READ in previous tx will still see the old data until it is committed.