Retrieving a large list of objects using Java EE
Is there a generally-accepted way to return a large list of objects using Java EE?
For example, if you had a database ResultSet that had millions of objects how would you return those objects to a (remote) client application?
Another example -- that is closer to what I'm actually doing -- would be to aggregate data from hundreds of sources, normalize it, and incrementally transfer it to a client system as a single "list".
Since all the data cannot fit in memory, I was thinking that a combination of a stateful SessionBean and some sort of custom Iterator that called back to the server would do the trick.
So, in other words, if I have an API like Iterator<Data> getData(), then what's a good way to implement getData() and Iterator<Data>?
How have you successfully solved this problem in the past?
Comments (3)
Definitely don't duplicate the entire DB into Java's memory. That makes no sense and only makes things unnecessarily slow and memory-hungry. Instead, introduce pagination at the database level: query only the data you actually need to display on the current page, as Google does.
If you have a hard time implementing this properly or working out the SQL query for your specific database, have a look at this answer. For the JPA/Hibernate equivalent, have a look at this answer.
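As a rough illustration of database-level pagination, here is a minimal JDBC sketch. The table `item` and its columns `id` and `name` are hypothetical, and the `LIMIT ? OFFSET ?` clause is the MySQL/PostgreSQL form; other databases need a different syntax.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class PageQuery {

    /** Converts a zero-based page number into a row offset. */
    public static long offsetFor(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        // Widen before multiplying so large page numbers do not overflow int.
        return (long) pageNumber * pageSize;
    }

    /** Loads a single page; at most pageSize rows reach Java memory. */
    public static List<String> fetchPage(Connection connection, int pageNumber, int pageSize)
            throws SQLException {
        // Hypothetical table and columns; LIMIT/OFFSET is database-specific syntax.
        String sql = "SELECT name FROM item ORDER BY id LIMIT ? OFFSET ?";
        List<String> page = new ArrayList<>(pageSize);
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setInt(1, pageSize);
            statement.setLong(2, offsetFor(pageNumber, pageSize));
            try (ResultSet resultSet = statement.executeQuery()) {
                while (resultSet.next()) {
                    page.add(resultSet.getString("name"));
                }
            }
        }
        return page;
    }
}
```

The stable ORDER BY matters here: without it the database may return rows in a different order per query, so pages could overlap or skip rows.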
Update: as per the comments (which actually change the entire subject of the question...), here's a basic (pseudo) kickoff example:
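A minimal sketch of that idea, assuming each input source can be modelled as an `Iterator` and the output as a `Consumer`. Both are stand-ins chosen for illustration, as is the class name `StreamingTransfer`; they are not the types from the original example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class StreamingTransfer {

    /** Copies entries one at a time and returns how many were transferred. */
    public static <T> long transfer(List<Iterator<T>> inputSources, Consumer<T> outputSource) {
        long transferred = 0;
        for (Iterator<T> inputSource : inputSources) {
            while (inputSource.hasNext()) {
                // Only the current entry is referenced here; once written,
                // it is eligible for garbage collection.
                outputSource.accept(inputSource.next());
                transferred++;
            }
        }
        return transferred;
    }

    public static void main(String[] args) {
        List<Iterator<String>> sources = Arrays.asList(
                Arrays.asList("a", "b").iterator(),
                Arrays.asList("c").iterator());
        long count = transfer(sources, System.out::println);
        System.out.println(count + " entries transferred");
    }
}
```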
This way you effectively end up with a single entry in Java's memory instead of the entire collection as in the following (inefficient) example:
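A sketch of that inefficient variant, again modelling sources as `Iterator`s and the output as a `Consumer` (illustrative stand-ins, not the original types). Every entry is collected into a list before anything is written, so memory use grows with the total number of entries.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class BufferedTransfer {

    /** Anti-pattern: buffers every entry before writing any of them. */
    public static <T> long transfer(List<Iterator<T>> inputSources, Consumer<T> outputSource) {
        // This list ends up holding the entire data set at once.
        List<T> entries = new ArrayList<>();
        for (Iterator<T> inputSource : inputSources) {
            while (inputSource.hasNext()) {
                entries.add(inputSource.next());
            }
        }
        // Writing starts only after everything has been read.
        for (T entry : entries) {
            outputSource.accept(entry);
        }
        return entries.size();
    }
}
```

The output is identical to the streaming version; only the peak memory footprint differs.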
Pagination is a good solution when working with a web-based UI. Sometimes, however, it is much more efficient to stream everything in one call. The rmiio library was written explicitly for this purpose and is already known to work in a variety of app servers.
If your list is huge, you must assume that it can't fit in memory. Or at least, if your server needs to handle it under many concurrent requests, you run a high risk of an OutOfMemoryError.
So basically, what you do is paging with batch reads. Let's say you load 1,000 objects from your database and send them to the client in the response to its request. You then loop until all objects have been processed. (See the response from BalusC.)
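One way to hide that batch loop behind the `Iterator` API from the question is sketched below. `BatchIterator` and `BatchLoader` are hypothetical names; in a Java EE setup the loader could delegate to a remote session bean method that returns one page per call, but that wiring is an assumption and is not shown.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class BatchIterator<T> implements Iterator<T> {

    /** Hypothetical callback that fetches one batch from the server. */
    public interface BatchLoader<T> {
        List<T> load(long offset, int limit);
    }

    private final BatchLoader<T> loader;
    private final int batchSize;
    private long nextOffset = 0;
    private Iterator<T> current = Collections.emptyIterator();
    private boolean exhausted = false;

    public BatchIterator(BatchLoader<T> loader, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be > 0");
        }
        this.loader = loader;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (!current.hasNext() && !exhausted) {
            // Fetch the next batch only when the current one is used up.
            List<T> batch = loader.load(nextOffset, batchSize);
            nextOffset += batch.size();
            // A short (or empty) batch means the source has no more items.
            exhausted = batch.size() < batchSize;
            current = batch.iterator();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

With this shape, the client holds at most one batch in memory at a time, and the server side stays stateless as long as the loader can resolve an offset on its own.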
The problem is the same on the client side, and you'll likely need to stream the data to the file system to prevent OutOfMemoryError there too.
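On the client, spooling to disk could look roughly like the sketch below. `FileSpooler` is a hypothetical name, and the one-item-per-line text format assumes that items contain no line breaks; a real client would pick a format that suits its data.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;

public class FileSpooler {

    /** Writes each item as one line and returns the number of lines written. */
    public static long spool(Iterator<String> items, Path target) throws IOException {
        long written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            while (items.hasNext()) {
                // Each item is written as soon as it arrives, so the client
                // never accumulates the whole result set in memory.
                writer.write(items.next());
                writer.newLine();
                written++;
            }
        }
        return written;
    }
}
```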
Please also note: it is okay to load millions of objects from a database as an administrative task, such as performing a backup or an export for some 'exceptional' case. But you should not offer it as a request that any user can make. It will be slow and drain server resources.