XmlDocument cache memory usage
We are seeing very high memory usage in .NET web applications that use XmlDocument.
A small (~5 MB) XML document is loaded into an XmlDocument object and stored in HttpContext.Cache for easy querying and XSLT transformation on each page load. The XML is modified on disk periodically, so the cache entry has a dependency on the file.
Such an application appears to be using hundreds of megabytes of RAM.
I have experimented with requesting garbage collection at the start of each request, and this keeps the RAM usage far lower, but I cannot imagine this is good practice.
Does anyone have any suggestions as to how we can achieve the same goal but with lower RAM usage?
My two cents . . .
I'd be worried if memory use grew exponentially with the size of the XML document, e.g. a 1 MB XML file settles at 10 MB, a 2 MB file flattens out at 30 MB, etc.
Also, consider the cost of the XML file not so much in terms of byte size, but in terms of the cost of each node. If your 5 MB XML doc had, say, two data nodes, then the in-memory representation of the document wouldn't be much greater than 5 MB (actually it could be far less, considering that binary data in XML will be double what it will be in memory).*
If your XML doc is UTF-8, and you've two large text nodes, then the in-memory representation could be 10 MB (the text could be stored in .NET strings, which are Unicode (UTF-16), and will be twice the width of standard English-language UTF-8 text).
If the XML document is composed of lots of discrete string values, then every node is an object, every node name is an object, and every node value is an object. So assuming references are 4 bytes, that's (at least) an extra 12 bytes per node.
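The doubling for English-language text is easy to check outside .NET. Here is a minimal Python sketch, assuming ASCII-only content; the sample string is made up for illustration, and it compares raw encoded sizes only, not actual .NET object sizes:

```python
# Compare the byte size of the same ASCII-only text in UTF-8 (typical on disk)
# and in UTF-16 (the encoding .NET strings use in memory).
text = "<item>hello world</item>" * 1000  # hypothetical sample content

utf8_bytes = len(text.encode("utf-8"))
utf16_bytes = len(text.encode("utf-16-le"))  # "-le" avoids counting a 2-byte BOM

print(utf8_bytes, utf16_bytes, utf16_bytes / utf8_bytes)  # 24000 48000 2.0
```

For text with many non-ASCII characters the ratio would be lower, since those already take more than one byte each in UTF-8.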
Now, assuming you've lots of nodes, and assuming your average length of node name + value is 20 characters, then the reference overhead of a 5 MB file is 3 MB. Add a possible extra 100% for the UTF-8 to Unicode conversion, and it takes 5 MB + 5 MB + 3 MB (at least) = 13 MB (at least) of RAM to store a 5 MB XML file . . . and that's not counting bytes lost to memory alignment, or the extra bytes used to store the size of each string object.**
Also consider that because you're caching the XML document, all those objects quickly end up as generation 2 objects, which basically means the GC will be very lazy about walking that considerable heap to see what it can collect.
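For what it's worth, the 13 MB figure above can be reproduced with a few lines of arithmetic. This is only a rough sketch in Python using the same assumptions as the estimate (20 characters per node, three 4-byte references per node, text doubling in size); the variable names are just for illustration, and real XmlDocument overhead will be higher:

```python
# Back-of-envelope estimate using the assumptions stated in the answer.
MB = 1_000_000

file_bytes = 5 * MB          # XML file size on disk (UTF-8, mostly ASCII)
avg_chars_per_node = 20      # assumed average length of node name + value
refs_per_node = 3            # one each for the node, its name, and its value
bytes_per_ref = 4            # assumes 32-bit references

node_count = file_bytes // avg_chars_per_node
reference_overhead = node_count * refs_per_node * bytes_per_ref
utf16_text = 2 * file_bytes  # same text at two bytes per character

estimate = utf16_text + reference_overhead
print(node_count, reference_overhead / MB, estimate / MB)  # 250000 3.0 13.0
```

With 8-byte references (a 64-bit process), the same formula gives 6 MB of reference overhead instead of 3 MB.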
See Rico Mariani's When to call GC.Collect() for the situations where it's not only OK to call GC.Collect(), but necessary to call it.
Hope this helps, sorry if I'm preaching to the choir on the memory size thing.
* I've no idea if this is actually the case, but would be surprised if it isn't.
** I'm assuming .NET strings store the size of the string before/after the actual characters of the string. This could significantly increase the in-memory representation, by an extra 4-8 bytes per node, giving a 20-byte cost per 20 bytes of node name/value, which effectively increases the overhead to match the size of the data stored.

Since aggressive GCing cleans things up, you should be looking for places where you may not be disposing objects that implement IDisposable. Perhaps you need to look at your code that uses the XSL Transform to be sure that objects used there are properly disposed.