使用多核的数据库 JDBC 与隔离级别开销

发布于 2024-11-15 06:13:45 字数 376 浏览 2 评论 0原文

你好,

我想使用 JDBC 将数据存入具有原生 WAL 的多核系统上的数据库中。我正在考虑在我的应用程序中生成多个线程以并行插入数据。

如果应用程序有多个线程,我必须将隔离级别提高到可重复读取,MVCC 数据库上的隔离级别应映射到快照隔离。

如果我使用一个线程,我就不需要隔离级别。据我所知,大多数快照隔离数据库都会分析所有可能发生冲突的事务的写入集,然后回滚除真正冲突事务之一之外的所有事务。更具体地说,我说的是 Oracle、InnoDB 和 PostgreSQL。

1.) 这种对写入集的分析成本昂贵吗?

2.) 多线程插入以获得更高的总吞吐量是个好主意吗?真正的冲突几乎是不可能的,因为应用程序层为线程提供无冲突的东西。但数据库应该是一个安全网。

Hallo,

I want to get data into a database on a multicore system with ative WAL using JDBC. I was thinking about spawning multiple threads in my application to insert data parallely.

If the application has multiple threads I will have to increase the isolation level to Repeatable Read which on MVCC-databases should be mapped to Snapshot isolation.

If I were using one thread I wouldn't need isolation levels. As far as I know most Snapshot isolation databases analyze the write sets of all transaction that could have a conflict and then rollback all but one of the real conflict transactions. More specific I'm talking about Oracle, InnoDB and PostgreSQL.

1.) Is this analyzing of the write sets expensive?

2.) Is it a good idea to multithread the inserts for a higher total throughput? Real conflict are nearly impossible because of the application layer feeding the threads conflict free stuff. But the database shall be a safety net.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

红颜悴 2024-11-22 06:13:46

Oracle 不支持可重复读。它仅支持已提交读和可序列化。我可能是错的,但为 Oracle 设置可重复读取的隔离级别可能会导致事务具有可序列化的隔离级别。简而言之,您只能依靠数据库对所需隔离级别的支持。

我不能代表 InnoDB 和 PostgreSQL,但如果它们不支持所需的隔离级别,同样适用。数据库可以自动将隔离级别升级到更高级别,以满足所需的隔离特性。如果您的应用程序所需的隔离级别必须是可重复读取,您应该重新考虑这种方法。

您正确推断的问题是,如果检测到冲突,乐观锁定可能会导致事务回滚。 Oracle 通过报告 ORA-08177 SQL 错误来做到这一点。由于当两个线程访问相同数据范围时会报告此错误,因此如果线程处理涉及不同数据范围的数据集,则可以避免该错误。在跨线程划分工作时,您必须确保这种情况。

Oracle does not support Repeatable Read. It supports only Read Committed and Serializable. I might be mistaken, but setting an isolation level of Repeatable Read for Oracle might result in a transaction with an isolation level of Serializable. In short, you are left to mercy of the database support for the isolation levels that you desire.

I cannot speak for InnoDB and PostgreSQL, but the same would apply if they do not support the required isolation levels. The database could automatically upgrade the isolation level to a higher level to meet the desired isolation characteristics. You ought to rethink this approach, if your application's desired isolation level has to be Repeatable Read.

The problem like you've rightly inferred is that optimistic locking will possibly result in transaction rollbacks, if a conflict is detected. Oracle does so by reporting the ORA-08177 SQL error. Since this error is reported when two threads will access the same data range, it could be avoided if the threads work against data sets involving different data ranges. You will have to ensure that this is the case when dividing work across threads.

樱娆 2024-11-22 06:13:46

我认为这里的限制因素是磁盘 IO,而不是转向可重复读取的开销。

即使是单个线程也可能会最大化数据库服务器上的磁盘,尤其是插入/更新时所需的数据库日志记录量。您确定情况还不是这样吗?

另外,在任何多用户系统中,您可能都希望以可重复读取隔离的方式运行(Postgres 仅支持此功能和可序列化)。因此,我不认为这会在我通常看到的之上增加任何“开销”。

I think the limiting factor here will be disk IO, not the overhead of moving to Repeatable Read.

Even a single thread may be able to max out the disks on the DB server especially with the amount of DB logging required on insert / update. Are you sure that's not already the case?

Also, in any multi-user system, you probably want to be running with Repeatable Read isolation anyway (Postgres only supports this and serializable). So, I don't think of this as adding any "overhead" above what I would normally see.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文