Caught: java.lang.OutOfMemoryError: Java heap space - using -Xmx is not an option
I have written a very complex database migration script in Groovy that runs just fine on my workstation but produces "Caught: java.lang.OutOfMemoryError: Java heap space" when run on the server's JVM. The JVM settings are fixed as they are (I have limited resources as an intern), so I need to figure out another way to fix this besides increasing available memory.
The error strikes when some of the largest tables are accessed: a particularly large, but simple, join (200,000+ rows to 50,000+ rows). Is there another way I can approach such a join that will save me from the error?
Example of query:
target.query("""
SELECT
a.*, b.neededColumn
FROM
bigTable a JOIN mediumTable b ON
a.stuff = b.stuff
ORDER BY stuff DESC
""") { ResultSet rs ->
...
}
Comments (2)
Can you run the join in SQL on the database server?
If not, you're probably stuck with iterating through each of your 200,000 results, joining it to the 50,000 rows, and writing out the results as you go (so you aren't storing more than 1*50,000 results in memory at any one time).
Or, if you have access to multiple machines, you could divide your 200,000 items into blocks and process one block per machine.
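The row-by-row approach described above could look roughly like the sketch below. It is an illustration under assumptions, not the answerer's code: `streamJoin` is a hypothetical helper, the two lists are made-up stand-ins for `bigTable` and `mediumTable`, the `id` column is invented, and `stuff` is assumed to be unique in the smaller table.

```groovy
// Hypothetical helper: joins a large row source against a smaller one,
// holding only the smaller side in memory.
def streamJoin(Iterable<Map> bigRows, Iterable<Map> mediumRows, Writer out) {
    // Index the smaller table by join key, keeping only the needed column.
    // Assumption: each key appears at most once in the smaller table.
    def lookup = [:]
    for (row in mediumRows) {
        lookup[row.stuff] = row.neededColumn
    }

    int written = 0
    // Walk the big side one row at a time; no result list is built up.
    for (row in bigRows) {
        if (lookup.containsKey(row.stuff)) {
            out << "${row.id},${row.stuff},${lookup[row.stuff]}\n"
            written++
        }
    }
    written
}

// Small in-memory stand-ins for the two tables (made-up data).
def big = [[id: 1, stuff: 'x'], [id: 2, stuff: 'y'], [id: 3, stuff: 'z']]
def medium = [[stuff: 'x', neededColumn: 'X1'], [stuff: 'z', neededColumn: 'Z1']]

def sink = new StringWriter()
def count = streamJoin(big, medium, sink)
println "joined ${count} rows"
print sink
```

With a real database, the big side would come from a cursor (a `ResultSet` or `eachRow`) instead of a list, and the writer would point at a file, so memory use stays bounded by the size of the smaller table.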
Edit
Taking your example code, you should be able to do:
That will write each row out to the file output.csv.
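The snippet from this answer is not shown here, so what follows is only a guess at the general shape of writing each row out as it is read. `writeCsv` is a hypothetical helper, the stub `ResultSet` exists only so the sketch runs without a database, and no CSV quoting is attempted.

```groovy
import java.sql.ResultSet
import java.sql.ResultSetMetaData

// Hypothetical helper (not from the original answer): writes each row of a
// ResultSet as one comma-separated line, keeping only the current row in memory.
def writeCsv(ResultSet rs, Writer out) {
    int columns = rs.getMetaData().getColumnCount()
    int rows = 0
    while (rs.next()) {
        def values = []
        // JDBC column indexes start at 1.
        for (i in 1..columns) {
            values << rs.getObject(i)
        }
        out << values.join(',') << '\n'
        rows++
    }
    rows
}

// Stub ResultSet with made-up data, used only to exercise the helper.
def data = [[1, 'x', 'X1'], [3, 'z', 'Z1']]
int cursor = -1
def stubMeta = [getColumnCount: { -> 3 }] as ResultSetMetaData
def stubRs = [
    next       : { -> ++cursor < data.size() },
    getMetaData: { -> stubMeta },
    getObject  : { int i -> data[cursor][i - 1] }
] as ResultSet

def sink = new StringWriter()
def written = writeCsv(stubRs, sink)
print sink

// Against the question's connection, the call might be wired up like this
// (assuming `target` is a groovy.sql.Sql and `joinSql` holds the query text):
// new File('output.csv').withWriter { w ->
//     target.query(joinSql) { ResultSet rs -> writeCsv(rs, w) }
// }
```

Some JDBC drivers buffer the whole result on the client unless a fetch size is set on the statement; whether that applies here depends on the driver in use.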
You have to change your code so that the rows are not all loaded into memory at the same time (i.e. stream the data and work on one row at a time). As far as I know, Groovy still doesn't do this when you use things like collect, so rewrite it to use a for loop.
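To make the difference concrete, here is a small sketch comparing `collect` with a plain loop over a lazily produced row source. `rowSource` is a hypothetical stand-in for a database cursor and the row shape is invented for illustration.

```groovy
// A lazy row source: rows are produced on demand, like a database cursor.
// (Hypothetical stand-in; the row shape is made up.)
def rowSource = { int n ->
    int i = 0
    [hasNext: { -> i < n }, next: { -> [id: ++i, stuff: 'value' + i] }] as Iterator
}

// collect builds a list holding every transformed row before it returns,
// so memory grows with the number of rows.
def all = rowSource(5).collect { row -> row.stuff.toUpperCase() }
println "collect kept ${all.size()} results in memory"

// A plain loop handles one row and then lets it go; only the running
// totals stay in memory.
int processed = 0
int longest = 0
for (row in rowSource(5)) {
    longest = Math.max(longest, row.stuff.length())
    processed++
}
println "processed ${processed} rows, longest value has ${longest} characters"
```

The same reasoning applies to `groovy.sql.Sql`: `rows()` returns a full list, while `eachRow` or a `while (rs.next())` loop inside `query` visits one row at a time.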