Best practices for when to start a new session/transaction for batch jobs using Spring/Hibernate, and when to commit/flush the session?
I have a tx-advice set up in Spring to wrap a transaction around my service methods. In my batch class, I call a service method to load a list of objects and return it to the batch class. The batch class then calls a service method to process each of those objects. But I get a lazy loading exception if that service method tries to access a lazy-loaded property of an object, since the list was loaded in a different Hibernate session, which is closed by then.
One way around this, which may not be optimal, is for the batch class to call a service that loads only the IDs of those objects (long values). We then pass each ID to a service method that loads the object from the DB by ID and processes it.
Thoughts on this?
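To make the ID-based idea concrete, here is a minimal sketch of it. Everything named here (`ItemService`, `InMemoryItemService`, `IdDrivenBatch`) is hypothetical and not from the actual application; the in-memory class only stands in for a Spring-managed service so the sketch runs without Spring or Hibernate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical service boundary. In the real application each method would be
// wrapped in its own transaction (and Hibernate session) by the tx-advice.
interface ItemService {
    List<Long> loadPendingIds();   // first transaction: returns plain IDs only
    void processById(Long id);     // later transactions: reload by ID, then process
}

// In-memory stand-in so the sketch is runnable on its own.
class InMemoryItemService implements ItemService {
    final Map<Long, String> rows = new LinkedHashMap<>();
    final List<Long> processed = new ArrayList<>();

    InMemoryItemService() {
        rows.put(1L, "a");
        rows.put(2L, "b");
        rows.put(3L, "c");
    }

    @Override
    public List<Long> loadPendingIds() {
        return new ArrayList<>(rows.keySet());
    }

    @Override
    public void processById(Long id) {
        // Stands in for loading the entity inside the current session, so
        // lazy properties would be reachable while processing.
        String row = rows.get(id);
        if (row == null) {
            throw new IllegalArgumentException("No row for id " + id);
        }
        processed.add(id);
    }
}

class IdDrivenBatch {
    // The batch class only ever holds IDs, never detached entities.
    static int run(ItemService service) {
        int count = 0;
        for (Long id : service.loadPendingIds()) {
            service.processById(id);
            count++;
        }
        return count;
    }
}
```

Because only `Long` values cross the service boundary, nothing the batch class holds can trigger a lazy load outside a session.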
Another question: if each of these objects is independent of the others, should I persist each object one at a time, persist them all at once, or batch them? With 5000 records, the application seems to slow down a lot when calling save/update/insert, since it is still doing all of that in memory in the Hibernate Session. If I instead save/update/insert each record (processing one ID at a time) and commit when done with that object before going on to the next, it seems to speed up a lot. Also, if I batch, say every 200, or even do all 5000 at once, and one record fails to insert/update with an error, nothing gets persisted and everything rolls back.
What are the best practices for handling things like this? It seems like something really common. Thanks.
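The one-record-per-transaction variant described above might be sketched like this. `PerRecordBatch` and `Result` are hypothetical names, and the `Consumer<Long>` is an assumption standing in for a transactional service method; the sketch relies on that method throwing an unchecked exception when a record fails, which is what Spring's default rollback rules react to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class PerRecordBatch {
    // Outcome of a run: which IDs went through and which ones failed.
    static class Result {
        final List<Long> succeeded = new ArrayList<>();
        final List<Long> failed = new ArrayList<>();
    }

    // processOne stands in for a service call that runs in its own
    // transaction: it commits on normal return and rolls back on exception.
    static Result run(List<Long> ids, Consumer<Long> processOne) {
        Result result = new Result();
        for (Long id : ids) {
            try {
                processOne.accept(id);
                result.succeeded.add(id);
            } catch (RuntimeException e) {
                // Only this record is lost; earlier commits are unaffected.
                result.failed.add(id);
            }
        }
        return result;
    }
}
```

The trade-off is one commit per record, but a bad record costs only itself, and the `failed` list can be logged or retried afterwards.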
Comments (2)
First of all, Spring and Hibernate are not really intended for batch processing. Instead, check out either Talend or Pentaho (if you are into open source), or any of a huge (massive!) variety of commercial tools. Either of these tools can be used to automatically generate a lump of Java code that will do exactly what you need (including insert optimization, elegant error handling, etc.).
OK, let's assume that you really, really want to make Spring/Hibernate do batch processing. You have a couple of different issues. First, the Hibernate session lifecycle means that loaded objects expect to be associated with a live session. You can call flush() on the session to force pending changes to propagate to the database. Calling close() on the session will wipe everything out. Objects that are already loaded can only be reattached to a new session with difficulty (usually it's easier to just reload the object). If you don't close()/flush() your session, eventually you will (probably) run out of memory. You can fix that by adding a Hibernate second-level cache... but that will just make things more complex and slow them down.
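If you do keep one long session, a common way to keep its memory bounded is to flush and then clear it every N records. A rough sketch follows; `BatchSession` is a hypothetical stand-in whose three methods mirror `save()`, `flush()` and `clear()` on Hibernate's `Session`, and `CountingSession` is only a fake so the example runs without Hibernate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three Session operations the loop needs.
interface BatchSession {
    void save(Object entity);
    void flush();   // push pending changes to the database
    void clear();   // drop managed objects from the session's memory
}

// Fake session that tracks how many objects it is holding on to.
class CountingSession implements BatchSession {
    final List<Object> held = new ArrayList<>();
    int flushes = 0;
    int maxHeld = 0;

    @Override
    public void save(Object entity) {
        held.add(entity);
        maxHeld = Math.max(maxHeld, held.size());
    }

    @Override
    public void flush() {
        flushes++;
    }

    @Override
    public void clear() {
        held.clear();
    }
}

class FlushClearBatch {
    // Saves everything in one session, flushing and clearing every batchSize
    // records so the session never holds more than batchSize objects.
    static void saveAll(BatchSession session, List<?> entities, int batchSize) {
        int pending = 0;
        for (Object entity : entities) {
            session.save(entity);
            if (++pending == batchSize) {
                session.flush();
                session.clear();
                pending = 0;
            }
        }
        if (pending > 0) {   // flush the final partial batch
            session.flush();
            session.clear();
        }
    }
}
```

Note that this only bounds memory. If the whole loop runs in a single transaction, one failing record still rolls everything back, so it does not replace committing per record or per chunk.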
There is no real reason not to just do each insertion within an independent Hibernate session (open, do work, close). It won't be as fast as a dedicated tool, but it's simple, will work fine, and is more or less as good as you'll get.
With regard to the batching requirement, please use Spring Batch (link).
It provides all the batching facilities you need.
Regarding the object loading issue, the approach you describe seems correct.