OutOfMemoryError exception: Java heap space, how to debug...?
I'm getting a java.lang.OutOfMemoryError: Java heap space exception.
I'm parsing an XML file, storing the data, and outputting an XML file when the parsing is complete.
I'm a bit surprised to get such an error, because the original XML file is not long at all.
Code: http://d.pr/RSzp
File: http://d.pr/PjrE
7 Answers
Short answer to explain why you get an OutOfMemoryError: for every centroid found in the file, you loop over the already "registered" centroids to check whether it is already known (to add a new one or to update the already registered one). But on every failed comparison you add a new copy of the new centroid. So each new centroid gets added as many times as there are centroids already in the list; then you encounter the first copy you added, update it, and leave the loop...
Here is some refactored code:
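The refactored snippet itself is not preserved here, so below is a minimal sketch of the fix described above, with a hypothetical Centroid class (getName() and update() are stand-ins for whatever the real code does): look centroids up in a Map keyed by their identity instead of adding copies inside the comparison loop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Centroid: a unique name plus an update() that merges new data.
class Centroid {
    private final String name;
    private int occurrences = 1;

    Centroid(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void update() {
        occurrences++; // stand-in for "update the already registered centroid"
    }
}

class CentroidRegistry {
    // A keyed lookup replaces the buggy pattern of adding a copy of the new
    // centroid on every failed comparison inside the loop.
    private final Map<String, Centroid> centroids = new LinkedHashMap<>();

    void register(Centroid parsed) {
        Centroid existing = centroids.get(parsed.getName());
        if (existing == null) {
            centroids.put(parsed.getName(), parsed); // genuinely new: add exactly once
        } else {
            existing.update(); // already known: update in place, never re-add
        }
    }
}
```

The same effect could be had by keeping the list but only adding the new centroid once, after the loop finishes without finding a match; the Map just makes the "add or update" decision impossible to get wrong.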
You could try setting the -Xms and -Xmx values higher in your eclipse.ini file (I'm assuming you're using Eclipse). For example:
-vmargs
-Xms128m   (initial heap size)
-Xmx256m   (maximum heap size)
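Worth noting: eclipse.ini sizes the JVM that runs the IDE itself, while a program launched from Eclipse takes its VM arguments from its run configuration. Outside the IDE, the same flags go straight on the java command line (the class and file names below are made up):

```
java -Xms128m -Xmx256m com.example.XmlParser input.xml
```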
If this is a one-off thing that you just want to get done, I'd try Jason's advice of increasing the memory available to Java.
You are building a very large list of objects and then looping through that list to output a String, then writing that String to a file. The list and the String are probably the reasons for your high memory usage. You could reorganise your code in a more stream-oriented way. Open your file output at the start, then write the XML for each Centroid as they are parsed. Then you wouldn't need to keep a big list of them, and you wouldn't need to hold a big String representing all the XML.
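A minimal sketch of that stream-oriented shape, with hypothetical Centroid/toXml() stand-ins for the poster's types; in the real code the loop body would live inside the parser's per-element callback:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StreamingOutput {
    // Hypothetical stand-in for the poster's Centroid type.
    static class Centroid {
        final String name;
        Centroid(String name) { this.name = name; }
        String toXml() { return "  <centroid name=\"" + name + "\"/>"; }
    }

    public static void main(String[] args) throws IOException {
        Centroid[] parsed = { new Centroid("a"), new Centroid("b") };
        try (BufferedWriter out = Files.newBufferedWriter(Paths.get("out.xml"))) {
            out.write("<centroids>\n");
            for (Centroid c : parsed) { // in real code: inside the parser callback
                out.write(c.toXml());   // written immediately, never accumulated
                out.write("\n");        // in a big list or one giant String
            }
            out.write("</centroids>\n");
        }
    }
}
```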
Dump the heap and analyze it. You can configure an automatic heap dump on an out-of-memory error with the -XX:+HeapDumpOnOutOfMemoryError JVM option.
http://www.oracle.com/technetwork/java/javase/index-137495.html
https://www.infoq.com/news/2015/12/OpenJDK-9-removal-of-HPROF-jhat
http://blogs.oracle.com/alanb/entry/heap_dumps_are_back_with
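For reference, both flags below are standard HotSpot options; the dump directory and jar name are made up:

```
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps -jar parser.jar
```

The resulting .hprof file can then be opened in a heap analyzer; since the InfoQ link above covers jhat's removal from OpenJDK 9, a tool like Eclipse MAT is the safer choice today.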
Answering the question "How to Debug"
It starts with gathering the information that's missing from your post: information that could also help future people who hit the same problem.
First, the complete stack trace. An out-of-memory exception that's thrown from within the XML parser is very different from one thrown from your code.
Second, the size of the XML file, because "not long at all" is completely useless. Is it 1K, 1M, or 1G? How many elements?
Third, how are you parsing? SAX, DOM, StAX, something completely different?
Fourth, how are you using the data? Are you processing one file or multiple files? Are you accidentally holding onto data after parsing? A code sample would help here (and a link to some 3rd-party site isn't terribly useful for future SO users).
Ok, I'll admit I'm avoiding your direct question with a possible alternative. You might want to consider parsing with XStream instead, to let it deal with the bulk of the work with less code. My rough example below parses your XML with a 64MB heap. Note that it also requires Apache Commons IO, just to read the input easily and allow the hack that turns the <collection> element into a <list>.
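The author's original example is not preserved here, so the following is a reconstruction of the approach being described rather than the exact code. The Centroid fields are hypothetical (the linked file is gone); the element rename works because XStream aliases <list> to java.util.ArrayList by default.

```java
import java.io.File;
import java.util.List;

import org.apache.commons.io.FileUtils;

import com.thoughtworks.xstream.XStream;

public class XStreamParse {
    // Hypothetical fields; the linked XML file is no longer available.
    public static class Centroid {
        String name;
        double x;
        double y;
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        String xml = FileUtils.readFileToString(new File("input.xml"), "UTF-8");

        // The "hack": rename the root element so XStream deserializes it
        // straight into an ArrayList via its built-in <list> alias.
        xml = xml.replace("<collection>", "<list>")
                 .replace("</collection>", "</list>");

        XStream xstream = new XStream();
        xstream.alias("centroid", Centroid.class);          // map <centroid> elements
        xstream.allowTypes(new Class[] { Centroid.class }); // required on XStream 1.4.10+

        List<Centroid> centroids = (List<Centroid>) xstream.fromXML(xml);
        System.out.println(centroids.size() + " centroids parsed");
    }
}
```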
I downloaded your code, something that I almost never do. And I can say with 99% certainty that the bug is in your code: an incorrect "if" inside a loop. It has nothing whatsoever to do with Digester or XML. Either you've made a logic error or you didn't fully think through just how many objects you'd create.
But guess what: I'm not going to tell you what your bug is.
If you can't figure it out from the few hints that I've given above, too bad. It's the same situation that you put all of the other respondents through by not providing enough information -- in the original post -- to actually start debugging.
Perhaps you should read -- actually read -- my earlier post, and update your question with the information it requests. Or, if you can't be bothered to do that, accept your F.