我正在研究数据库中潜在的并发问题,所以我去阅读了。我发现 http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/c0005267.htm 和它提到了对未提交数据的访问。
访问未提交的数据。
应用程序 A 可能会更新以下值
数据库,应用程序 B 可能
之前读取该值
坚定的。那么,如果 A 的值为
不是后来承诺,而是退出,
B 执行的计算是
基于未承诺的(大概是
无效)数据。
什么......我认为其他会话(相同的应用程序,甚至相同的线程)可以读取尚未提交的数据?我认为只有将数据写入未提交事务的连接/会话(我不确定我的术语)可以读取未提交的数据。
其他线程真的可以读取尚未提交的数据吗?
我打算使用 mysql 但我可能会使用 sqlite
I was looking at potential concurrency issues in DB so i went to read up. I found http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/c0005267.htm and it mentions access to uncommitted data.
Access to uncommitted data.
Application A might update a value in
the database, and application B might
read that value before it was
committed. Then, if the value of A is
not later committed, but backed out,
the calculations performed by B are
based on uncommitted (and presumably
invalid) data.
What... i thought other sessions (same app and even same thread) can read data that has not been committed yet? I thought only the connection/session (i am not sure of my terminology) that wrote the data into the uncommitted transaction can read uncommitted data.
Can other threads really read data that hasnt been committed?
I plan to use mysql but i may use sqlite
发布评论
评论(4)
其他会话可以读取的内容取决于您如何设置数据库。在 MySQL 中,这还取决于您使用的数据库引擎。您要查找的术语(在 ANSI SQL 术语中)是“隔离级别”。
许多数据库将默认设置一个隔离级别,其中未提交数据的读取将被阻止。因此,如果事务 A 更新表 T 中的记录 1234,然后事务 B 尝试在 A 提交或回滚之前选择记录 1234,则 B 将阻塞,直到 A 执行其中一项操作。
请参阅 MySQL 事务,第二部分 - 事务隔离级别。
这样做的一个严重缺点是,长时间运行的事务中的批量更新操作(通常)可能会阻止许多请求。
您还可以设置它,以便 B 看到未提交的数据,但这通常是不明智的。
或者,您可以使用名为 MVCC(“多版本并发控制”)的方案,这将给出不同的事务基于事务开始时间的一致数据视图。这避免了未提交的读取问题(读取可能回滚的数据)并且更具可扩展性,特别是在长期事务的上下文中。
MySQL 支持 MVCC。
What other sessions can read depends on how you set up your database. In MySQL it also depends on what database engine you use. The term you're looking for (in ANSI SQL terms) is "isolation level".
Many databases will default to an isolation level where reads on uncommitted data will block. So if transaction A updates record 1234 in table T and then transaction B tries to select record 1234 before A commits or rolls back then B will block until A does one of those things.
See MySQL Transactions, Part II - Transaction Isolation Levels.
One serious downside of this is that batch update operations that live in long-running transactions (typically) can potentially block many requests.
You can also set it so B will see uncommitted data but that is often ill-advised.
Alternatively you can use a scheme called MVCC ("Multiversion concurrency control"), which will give different transactions a consistent view of the data based on the time the transaction started. This avoids the uncommitted read problem (reading data that may be rolled back) and is much more scalable, especially in the context of long-lived transactions.
MySQL supports MVCC.
当然在 SQL Server 中你可以,你必须选择这样做,这不是默认的,但是如果你使用正确的隔离级别或查询提示,你可以选择读取未提交的行,这可能会导致问题,甚至双重理论上读取同一行。
Certainly in SQL Server you can, you have to chose to do it, it is not the default, but if you use the right isolation level or query hint you can chose to read an uncommitted row, this can leads to problems and even a double read of the same row in theory.
该文章提到对未提交数据的访问是数据库管理器消除的问题之一。
MySQL 的 InnoDB 存储引擎支持多种事务隔离级别。详细信息请参见
http://dev.mysql.com/doc/refman/ 5.4/en/set-transaction.html。
That article mentions access to uncommitted data as one of the problems eliminated by the database manager.
MySQL's InnoDB storage engine supports several transaction isolation levels. For details, see
http://dev.mysql.com/doc/refman/5.4/en/set-transaction.html.
对于某些数据库的某些版本,将查询设置为能够读取未提交的数据将提高性能,因为减少了锁定。但这仍然留下了安全性、可靠性和可扩展性的问题有待解答。
具体来说,我曾经在一个非常大的电子商务网站上工作。他们在读取存储目录时使用未提交读,因为数据被大量访问,很少更改,并且对读取未提交数据的担忧不敏感。无论如何,用于下订单的目录中的任何数据都将被重新验证。这是在 SQL Server 2000 上进行的,众所周知,它存在锁定性能问题。在较新版本的 SQL Server 上,锁定性能已得到改进,因此没有必要这样做。
For some versions of some databases, setting queries to be able to read uncommitted will improve performance, because of reduced locking. That still leaves questions of security, reliability, and scalability to be answered.
To give a specific, I used to work on a very large e-commerce site. They used read uncommitted on reads to the store catalog, since the data was heavily accessed, infrequently changed, and not sensitive to concerns about reading uncommitted data. Any data from the catalog that was used to place an order would be re-verified anyway. This was on SQL Server 2000, which was known to have locking performance problems. On newer versions of SQL Server, the locking performance has improved, so this wouldn't be necessary.