ERP 的最佳默认事务隔离级别是多少(如果有)?
我们刚刚开始使用 Hibernate 将 ERP 系统迁移/重新实现到 Java,目标是使用该系统的并发用户数为 50-100 个。 我们使用MS SQL Server作为数据库服务器,这足以满足负载。
旧系统不使用事务,并且依赖于设置手动锁定(使用标志)并释放它们的关键部分(例如库存变化)。 这类似于手动事务管理。 但有时会出现数据不一致的问题。 在新系统中,我们希望使用交易来消除这些问题。
假设 OLTP 和 OLAP 的使用率分别为 85% 和 15%,对于 ERP 系统来说,什么是好的/合理的默认事务隔离级别?
或者我应该始终根据每个任务决定使用哪个事务级别?
四种事务隔离级别:
- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
- SERIALIZABLE
We are just starting to migrate/reimplement an ERP system to Java with Hibernate, targeting a concurrent user count of 50-100 users using the system. We use MS SQL Server as database server, which is good enough for the load.
The old system doesn't use transactions and relies for critical parts (e.g. stock changes) on setting manual locks (using flags) and releasing them. That's something like manual transaction management. But there are sometimes problems with data inconsistency. In the new system, we would like to use transactions to wipe out these problems.
What would be a good/reasonable default transaction isolation level to use for an ERP system, given a usage of about 85% OLTP and 15% OLAP?
Or should I always decide, on a per task basis, which transaction level to use?
The four transaction isolation levels:
- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
- SERIALIZABLE
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
100 次中有 99 次,读已提交是正确的答案。 这确保您只能看到其他会话已提交的更改(因此,假设您已正确设计事务,结果是一致的)。 但它不会带来可重复读取或可序列化带来的锁定开销(特别是在非 Oracle 数据库中)。
偶尔,您可能希望运行一个报告,您愿意为了速度而牺牲准确性并设置读取未提交的隔离级别。 这很少是一个好主意,但有时它是锁定争用问题的合理可接受的解决方法。
当您的进程需要在整个运行过程中查看一组一致的数据(无论其他事务当时正在做什么)时,偶尔会使用可序列化和可重复读取。 将月末对账流程设置为可序列化可能是合适的,例如,如果有大量过程代码,用户可能会在流程运行时进行更改,并且要求流程需要确保它始终看到协调开始时存在的数据。
99 times out of 100, read committed is the right answer. That ensures that you only see changes that have been committed by the other session (and, thus, results that are consistent, assuming you've designed your transactions correctly). But it doesn't impose the locking overhead (particularly in non-Oracle databases) that repeatable read or serializable impose.
Very occasionally, you may want to run a report where you are willing to sacrifice accuracy for speed and set a read uncommitted isolation level. That's rarely a good idea, but it is occasionally a reasonably acceptable workaround to lock contention issues.
Serializable and repeatable read are occasionally used when you have a process that needs to see a consistent set of data over the entire run regardless of what other transactions are doing at the time. It may be appropriate to set a month-end reconciliation process to serializable, for example, if there is a lot of procedureal code, a possibility that users are going to be making changes while the process is running and a requirement that the process needs to ensure that it is always seeing the data as it existed at the time the reconciliation started.
不要忘记 SNAPSHOT,它位于 SERIALIZABLE 的正下方。
这取决于报告中数据准确的重要性。 这确实是一个逐个任务的事情。
Don't forget about SNAPSHOT, which is right below SERIALIZABLE.
It depends on how important it is for the data to be accurate in the reports. It really is a task-by-task thing.
这实际上很大程度上取决于您如何设计应用程序,简单的答案就是在 READ_COMMITTED 运行。
您可以提出这样的论点:如果您在设计系统时考虑到这一点,则可以使用 READ_UNCOMMITTED 作为默认值,并且仅在需要时才增加隔离级别。 无论如何,绝大多数事务都会成功,因此读取未提交的数据并不是什么大问题。
隔离级别影响查询的方式取决于您的目标数据库。 例如,与 Oracle 等数据库相比,Sybase 和 MSSQL 等数据库在运行 READ_COMMITTED 时必须锁定更多资源。
It really depends a lot on how you design your application, the easy answer is just run at READ_COMMITTED.
You can make an argument that if you design your system with it in mind that you could use READ_UNCOMMITTED as the default and only increase the isolation level when you need it. The vast majority of your transactions are going to succeed anyway so reading uncommitted data won't be a big deal.
The way isolation levels effect your queries depends on your target database. For instance databases like Sybase and MSSQL must lock more resources when you run READ_COMMITTED, than databases like Oracle.
对于 SQL Server(可能还有大多数主要的 RDBMS),我会坚持使用默认值。 对于 SQL Server,这是 READ COMMITTED。 如果多了,就会开始给数据库带来过重的负担;如果少了,就会遇到一致性问题。
For SQL Server (and probably most major RDBMS), I'd stick with the default. For SQL Server, this is READ COMMITTED. Anything more and you start overtaxing the DB, anything less and you've got consistency issues.
Read Uncommissed 绝对是大多数论坛中的弱势群体。 然而,使用它的理由超出了经常指出的“速度与准确性”的问题。
假设您有:
当读已提交时,上述事务只有在提交后才会释放。 那么你可能会遇到这样的情况:T1正在等待T2释放A,而T2正在等待T1释放B。这里两个事务在锁中发生冲突。
您可以重写这些过程来避免这种情况(例如:始终按字母顺序获取资源!)。 尽管如此,由于并发用户太多和数万行代码,这个问题可能会变得非常有可能而且很难诊断和解决。
另一种方法是使用未提交的读取。 然后,您设计事务时假设可能存在脏读。 我个人认为这个问题比连锁火车残骸更局部化和更容易治疗。
脏读的问题可以通过
(1) 回滚来解决:不可以。 这应该是仅在出现硬件故障、网络故障或程序崩溃时的最后一道防线。
(2) 使用应用锁创建运行的锁定机制
在更高的抽象级别,每个锁更接近于
现实世界的资源或操作。
Read Uncommitted is definitely the underdog in most forums. However, there are reasons to use it that go beyond a matter of "speed versus accuracy" that is often pointed out.
Let's say you have:
With read committed, the transactions above won't release until committing. Then you can run into a situation where T1 is waiting for T2 to release A, and T2 is waiting for T1 to release B. Here the two transactions collide in a lock.
You could re-write those procedures to avoid this scenario (example: acquire resources always in alphabetical order!). Still, with too many concurrent users and tens of thousands of lines of code, this problem may become both very likely and very difficult to diagnose and resolve.
The alternative is using Read Uncommitted. Then you design your transactions assuming that there may be dirty reads. I personally find this problem much more localized and treatable than the interlocking trainwrecks.
The issues from dirty reads can be preempted by
(1) Rollbacks: don't. This should be the last line of defense in case of hardware failure, network failure or program crash only.
(2) Use application locks to create a locking mechanism that operates
at a higher level of abstraction, where each lock is closer to a
real-world resource or action.