How to correctly iterate over database records with Hibernate

Posted 2024-12-22 13:21:38

I want to iterate over records in the database and update them. However, since each update takes some time and is prone to errors, I need to a) avoid keeping the database waiting (as would happen with, e.g., a ScrollableResults) and b) commit after each update.
The second point is that this is done in multiple threads, so I need to ensure that if thread A is taking care of a record, thread B gets a different one.
How can I implement this sensibly with Hibernate?

To give a better idea, the following code would be executed by several threads, where all threads share a single instance of the RecordIterator:

Iterator<Record> iter = db.getRecordIterator();
while(iter.hasNext()){
    Record rec = iter.next();
    // do something lengthy here
    db.save(rec);
}

So my question is how to implement the RecordIterator. If I perform a query on every next(), how do I ensure that I don't return the same record twice? If I don't, which query should I use to return detached objects? Is there a flaw in the general approach (should I, e.g., use one RecordIterator per thread and let the database handle synchronization somehow)? Additional info: there are far too many records to keep track of them locally (e.g. in a set of treated records).

Update: Because the overall process takes some time, the status of Records can change while it runs, and so the ordering of a query's results can change too. I guess that to solve this problem I have to mark records in the database once I return them for processing...

梦旅人picnic 2024-12-29 13:21:39

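My suggestion: since you are sharing a single instance of the master iterator, run all threads within a shared Hibernate transaction, with one load at the beginning and one big save at the end. Load all the data into a Set that the threads can iterate over (be careful with locking; you may want to split off a section for each thread, or otherwise manage the shared resource so that the threads don't overlap).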
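The idea of giving each thread its own section can be sketched roughly as follows. This is only a minimal illustration under assumptions: record IDs are plain Long values, PartitionSketch, partition and processAll are hypothetical names, and the lengthy update itself is left as a comment. It uses no Hibernate API at all. Since a Hibernate Session is not meant to be used by several threads at once, a real version would probably have each worker open its own session for its slice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hibernate: splits work so threads never overlap.
class PartitionSketch {

    // Splits items into at most `parts` contiguous, non-overlapping slices.
    // Assumes parts > 0.
    static <T> List<List<T>> partition(List<T> items, int parts) {
        List<List<T>> slices = new ArrayList<>();
        int size = items.size();
        int chunk = (size + parts - 1) / parts; // ceiling division
        for (int start = 0; start < size; start += chunk) {
            int end = Math.min(start + chunk, size);
            slices.add(new ArrayList<>(items.subList(start, end)));
        }
        return slices;
    }

    // Hands each slice to its own task and returns the IDs that were handled.
    static Set<Long> processAll(List<Long> ids, int threads) {
        Set<Long> processed = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<Long> slice : partition(ids, threads)) {
            pool.submit(() -> {
                for (Long id : slice) {
                    // The lengthy update of record `id` would go here.
                    processed.add(id);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return processed;
    }
}
```

Because every ID lands in exactly one slice, no locking is needed between workers; the trade-off is that a slow slice cannot be rebalanced onto idle threads.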

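The advantage of the Hibernate solution is that records are not saved to the database immediately (since you are using a transaction) but are held in Hibernate's cache. At the end they are all written back to the database in one go. This saves the expensive database writes you are worried about, and it gives you actual objects to work with on each iteration rather than just database rows.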

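I see in your update that the status of records can change during processing, which will always cause problems. If this is a constantly running or long-running process, my advice with the Hibernate solution is to work in smaller sets and, yes, add a flag marking the records that have been updated, so that when you move on to the next set you can pick up the ones that haven't been touched.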
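The "smaller sets plus a flag" idea might look something like the sketch below. It is an in-memory stand-in, not Hibernate code: the ConcurrentHashMap plays the role of the table, and ClaimSketch, Status, claimBatch and markDone are hypothetical names. The atomic replace(key, NEW, IN_PROGRESS) call stands in for what would presumably be a conditional update of the flag column in a real database, so that two threads can never claim the same record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of claiming records through a status flag.
class ClaimSketch {
    enum Status { NEW, IN_PROGRESS, DONE }

    // Stand-in for the table: record id -> status flag.
    final Map<Long, Status> table = new ConcurrentHashMap<>();

    // Claims up to batchSize records that are still NEW.
    // replace(id, NEW, IN_PROGRESS) succeeds for exactly one caller per record,
    // so concurrent callers receive disjoint batches.
    List<Long> claimBatch(int batchSize) {
        List<Long> claimed = new ArrayList<>();
        for (Long id : table.keySet()) {
            if (claimed.size() >= batchSize) {
                break;
            }
            if (table.replace(id, Status.NEW, Status.IN_PROGRESS)) {
                claimed.add(id);
            }
        }
        return claimed;
    }

    // Marks a record as finished once its (lengthy) update has been saved.
    void markDone(Long id) {
        table.put(id, Status.DONE);
    }
}
```

Each worker would loop: claim a small batch, process and save each record, call markDone, and stop when claimBatch returns an empty list. Records left IN_PROGRESS after a crash would need some separate recovery rule, which this sketch does not cover.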
