Diagnosing .NET OutOfMemoryException when generating reports
I'm tasked with improving a piece of code that generates massive reports, in any way I see fit.
There are about 10 identical reports generated (for each 'section' of the database), and the code for them is similar to this:
GeneratePurchaseReport(Country.France, ProductType.Chair);
GC.Collect();
GeneratePurchaseReport(Country.France, ProductType.Table);
GC.Collect();
GeneratePurchaseReport(Country.Italy, ProductType.Chair);
GC.Collect();
GeneratePurchaseReport(Country.Italy, ProductType.Table);
GC.Collect();
If I remove those GC.Collect() calls, the reporting service crashes with OutOfMemoryException.
The bulk of the memory is kept in a massive List<T> which is filled inside GeneratePurchaseReport and is no longer of use as soon as it exits - which is why a full GC collection will reclaim the memory.
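For illustration, the method is shaped roughly like the sketch below; every identifier in it is made up, and only the overall pattern (fill a huge list, write it out, return) matches the real code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// All identifiers here are invented for illustration; only the shape matches
// the real code: a huge List<T> is filled and then dies with the method.
enum Country { France, Italy }
enum ProductType { Chair, Table }

class ReportShapeSketch
{
    static void Main()
    {
        GeneratePurchaseReport(Country.France, ProductType.Chair);
    }

    static void GeneratePurchaseReport(Country country, ProductType type)
    {
        // Stand-in for the rows pulled back from the database; this list grows huge.
        var rows = new List<string>();
        for (int i = 0; i < 1_000_000; i++)
            rows.Add($"{country};{type};row {i}");

        // The list is only needed while the report is written out...
        File.WriteAllLines($"{country}-{type}.csv", rows);
        // ...and is garbage as soon as the method returns.
    }
}
```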
My question is two-fold:

- Why doesn't the GC do this on its own? As soon as it's running out of memory on the second GeneratePurchaseReport it should do a full collection before crashing and burning, shouldn't it?
- Is there a memory limit which I can raise somehow? I don't mind at all if data is swapped to disk, but the .NET process is using far less memory than even the available 2.5GB of RAM! I'd expect it to only crash if it's run out of address space, but on a 64-bit machine I doubt that happens so soon.
Answers (5)
Read up on the Large Object Heap.
I think what's happening is that the final document for individual reports is built and appended to over time, such that at each append operation a new document is created and the old is discarded (that probably happens behind the scenes). This document is (eventually) larger than the 85,000 byte threshold for storage on the Large Object Heap.
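To make the suspected pattern concrete, here is a minimal, purely hypothetical sketch (nothing here is taken from the actual reporting code); every append copies the whole document so far into a new object and abandons the old one:

```csharp
using System;
using System.Collections.Generic;

class AppendAntiPatternSketch
{
    static void Main()
    {
        // Hypothetical stand-in for the report rows; the real code reads them from a database.
        var rows = new List<string>();
        for (int i = 0; i < 10_000; i++)
            rows.Add($"row {i}");

        // Every '+=' copies the whole report so far into a brand-new string.
        // Once the string passes the ~85,000-byte threshold, each iteration
        // allocates another large object on the LOH and abandons the previous one.
        string report = "";
        foreach (var row in rows)
            report += row + Environment.NewLine;

        Console.WriteLine(report.Length);
    }
}
```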
In this scenario, you're actually not using that much physical memory - it's still available for other processes. What you are using is address space that is available to your program. Every process in Windows has its own (typically) 2GB address space available. Over time as you allocate new copies of your growing report document, you leave behind numerous holes in the LOH when the prior copy is collected. The memory freed by prior objects is not actually used anymore and is available for other processes, but the address space is still lost; it's fragmented and needs to be compacted. Eventually this address space fills up and you get an OutOfMemory exception.
The evidence suggests that calls to GC.Collect() allow for some compaction of the LOH, but it's not a perfect solution. Just about everything else I've read on the subject indicates that GC.Collect() is not supposed to compact the LOH at all, but I've seen several anecdotal reports (some here on Stack Overflow) where calling GC.Collect() was in fact able to avert OutOfMemory Exceptions from LOH fragmentation.
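(As an aside: on runtimes newer than this question probably targets, .NET Framework 4.5.1 and later, LOH compaction can be requested explicitly instead of relying on that side effect. A minimal sketch, assuming such a runtime is available:)

```csharp
using System;
using System.Runtime;

class LohCompactionSketch
{
    static void Main()
    {
        // .NET Framework 4.5.1+ (and .NET Core/.NET): ask the next blocking
        // full collection to also compact the large object heap.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();

        // The mode automatically resets to Default once that collection has run.
        Console.WriteLine(GCSettings.LargeObjectHeapCompactionMode);
    }
}
```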
A "better" solution (in terms of being sure you won't ever run out of memory -- using GC.Collect() to compact the LOH just isn't reliable) is to splinter your report into units smaller than 85000 bytes, and write them all into a single buffer at the end, or using a data structure that doesn't throw away your prior work as it grows. Unfortunately, this is likely to be a lot more code.
One relatively simple option here is to allocate a buffer for a MemoryStream object that is bigger than your largest report, and then write into the MemoryStream as you build the report. This way you never leave fragments. If this is just written to disk you might even go right to a FileStream (perhaps via a TextWriter, to make it easy to change later). If this option solves your problem, I'd like to hear about it in a comment to this answer.
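A minimal sketch of that option, assuming 64 MB comfortably exceeds your largest report (the file name and row format below are placeholders):

```csharp
using System;
using System.IO;
using System.Text;

class PreallocatedReportBuffer
{
    static void Main()
    {
        // Assumption: no single report will ever exceed 64 MB.
        const int maxReportBytes = 64 * 1024 * 1024;

        // One buffer allocated up front; appends never reallocate it,
        // so nothing is left behind to fragment the LOH.
        using (var buffer = new MemoryStream(maxReportBytes))
        using (var writer = new StreamWriter(buffer, Encoding.UTF8))
        {
            for (int i = 0; i < 100_000; i++)
                writer.WriteLine($"row {i}");        // stand-in for real report rows

            writer.Flush();

            // Copy the finished report out without another large allocation.
            using (var file = File.Create("report.txt"))
                buffer.WriteTo(file);
        }

        // Alternatively, skip the MemoryStream and write straight to disk:
        // using (var writer = new StreamWriter(new FileStream("report.txt", FileMode.Create)))
        //     writer.WriteLine("...");
    }
}
```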
We would need to see your code to be sure.
Failing that:
Are you pre-sizing the List with an expected number of items? (See the sketch after this list of suggestions.)
Can you pre-allocate and use an array instead of a list? (boxing/unboxing might then be an additional cost)
Even on a 64 bit machine, the largest size a single CLR object can be is 2GB
Pre-allocate a MemoryStream to hold the entire report, and write to that.
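A minimal sketch of the pre-sizing suggestion; the expected row count here is a made-up stand-in for something like a cheap COUNT(*) query against the section being reported on:

```csharp
using System;
using System.Collections.Generic;

class PreSizedListSketch
{
    static void Main()
    {
        // Assumption: the row count (or a safe upper bound) can be obtained up front.
        const int expectedRows = 500_000;

        // Pre-sizing avoids the repeated grow-and-copy cycle that abandons
        // ever-larger backing arrays on the LOH as the list expands.
        var rows = new List<string>(expectedRows);

        for (int i = 0; i < expectedRows; i++)
            rows.Add($"row {i}");                // stand-in for real report rows

        Console.WriteLine(rows.Count);
    }
}
```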
Of interest?:
BigArray, getting around the 2GB array size limit
Large Object Heap Uncovered
I would suggest using a memory profiler such as memprofiler or Redgate's (both have free trials) to see where the problem actually lies.
The reason is probably the Large Object Heap and any objects which use a native heap internally, e.g. the Bitmap class.
The large object heap is also a traditional C-style heap, which fragments. Fragmentation is one aspect of this issue.
But I think it also has something to do with how the GC determines when to collect. It works perfectly for the normal generational heaps, but for memory allocated in other heaps, especially in native heaps, it may not have enough information to make a perfect decision. And the LOH is treated as generation 2, which means it has the least chance of being collected.
So in your case, I think manually forcing a collection is a reasonable solution. But yes, it is not perfect.
PS: I'd like to add a bit more information to Joel's good explanation. The LOH threshold is 85,000 bytes for normal objects, but for double arrays it is 8,000 bytes.
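A quick way to check where an allocation lands on your own runtime: objects that go straight to the LOH are reported as the highest generation (2) immediately after allocation. The exact sizes below are the commonly cited thresholds and may vary by runtime:

```csharp
using System;

class LohThresholdProbe
{
    static void Main()
    {
        // Small allocations start life in generation 0 on the small object heap;
        // allocations that go straight to the LOH report the highest generation (2).
        Console.WriteLine(GC.GetGeneration(new byte[84_000]));   // expected: 0
        Console.WriteLine(GC.GetGeneration(new byte[85_000]));   // expected: 2 (LOH)
        Console.WriteLine(GC.GetGeneration(new double[1_000]));  // 2 where the 8,000-byte double-array rule applies
    }
}
```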
Are you using Microsoft SQL Server Reporting Services?
If so: http://support.microsoft.com/kb/909678
First of all, garbage collection runs on one assumption: the capacity of the heap is unlimited. The garbage collector does not collect objects because the program has run out of memory; it collects objects that are no longer used by the program. Depending on the GC algorithm, I believe the GC marks the memory used for the report as still in use, so it cannot simply remove it.
The reason the GC does not do its job across the consecutive GeneratePurchaseReport() calls is that the GC is not running all the time. It employs an algorithm that predicts how often garbage should be collected based on past behavior, and in your case it certainly does not predict that garbage needs to be collected on four consecutive lines.