Memory management - how and when to write large objects to disk
I am working on an application which has the potential for a large memory load (>5 GB) but is required to run on 32-bit, .NET 2 based desktops due to the customer deployment environment. My solution so far has been to use an app-wide data store for these large-volume objects: when an object is assigned to the store, the store checks the app's total memory usage, and if it is getting close to the limit it starts serialising some of the older objects in the store to the user's temp folder, retrieving them back into memory as and when they are needed. This is proving decidedly unreliable, because if other objects within the app start using memory, the store gets no prompt to clear up and make space. I did look at using weak references to hold the in-memory data objects, serialising them to disk when they were released; however, the objects seemed to be released almost immediately, especially in debug builds, causing a massive performance hit as the app serialised everything.
Are there any useful patterns/paradigms I should be using to handle this? I have googled extensively but as yet haven't found anything useful.
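To make the setup concrete, here is a minimal sketch of the kind of spill-to-disk store described above, using .NET 2.0-era APIs; the class name LargeObjectStore, the memory threshold, and the eviction policy are illustrative guesses, not the actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical app-wide store that spills the oldest entry to the user's
// temp folder once the managed heap approaches an assumed threshold.
public class LargeObjectStore
{
    private readonly Dictionary<string, object> _inMemory = new Dictionary<string, object>();
    private readonly List<string> _accessOrder = new List<string>();            // oldest first
    private readonly Dictionary<string, string> _onDisk = new Dictionary<string, string>();
    private const long MemoryLimitBytes = 1200L * 1024 * 1024;                  // assumed threshold

    public void Add(string key, object value)
    {
        _inMemory[key] = value;
        _accessOrder.Add(key);

        // GC.GetTotalMemory only counts the managed heap; a real implementation
        // would also consider unmanaged memory, duplicate keys and batched eviction.
        if (GC.GetTotalMemory(false) > MemoryLimitBytes && _accessOrder.Count > 1)
            SpillOldest();
    }

    public object Get(string key)
    {
        object value;
        if (_inMemory.TryGetValue(key, out value))
            return value;

        // Reload from the temp file written earlier.
        string path = _onDisk[key];
        using (FileStream fs = File.OpenRead(path))
            value = new BinaryFormatter().Deserialize(fs);
        File.Delete(path);

        _onDisk.Remove(key);
        _inMemory[key] = value;
        _accessOrder.Add(key);
        return value;
    }

    private void SpillOldest()
    {
        string oldestKey = _accessOrder[0];
        _accessOrder.RemoveAt(0);

        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".bin");
        using (FileStream fs = File.Create(path))
            new BinaryFormatter().Serialize(fs, _inMemory[oldestKey]);

        _onDisk[oldestKey] = path;
        _inMemory.Remove(oldestKey);
    }
}
```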
3 Answers
I thought virtual memory was supposed to have you covered in this situation?
Anyway, it seems suspect that you really need all 5 GB of data in memory at any given moment - you can't possibly be processing all of it at once, at least not on what sounds like a consumer PC. You didn't go into detail about your data, but to me it smells like the object itself is poorly designed, in the sense that you need the entire set in memory to work with it. Have you thought about fragmenting your data into more sensible units, and then pre-emptively loading each unit from disk just before it needs to be processed? You'd essentially be paying a more constant performance cost this way, but you'd reduce your current thrashing issue.
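As a rough illustration of that suggestion, here is a sketch of processing a large file one fixed-size segment at a time, so that only the segment being worked on is resident in memory; the ChunkedProcessor name and the 64 MB segment size are assumptions:

```csharp
using System;
using System.IO;

// Hypothetical example: process a multi-gigabyte file one fixed-size segment
// at a time, so only a single segment ever needs to be in memory.
public static class ChunkedProcessor
{
    private const int SegmentSize = 64 * 1024 * 1024; // 64 MB per segment (assumed)

    public static void ProcessFile(string path)
    {
        byte[] buffer = new byte[SegmentSize];

        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                              FileShare.Read, 1 << 16, FileOptions.SequentialScan))
        {
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Work on buffer[0..read) here; earlier segments are no longer
                // referenced, so the working set stays roughly one segment in size.
                ProcessSegment(buffer, read);
            }
        }
    }

    private static void ProcessSegment(byte[] data, int count)
    {
        // Placeholder for the real per-segment work.
    }
}
```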
Maybe you could go the memory-mapped file route (see "Managing Memory-Mapped Files" on MSDN). On .NET 2.0 you have to use P/Invoke to reach those functions; since .NET 4.0 there is efficient built-in support via the MemoryMappedFile class.
Also take a look at:
http://msdn.microsoft.com/en-us/library/dd997372.aspx
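For the .NET 2.0 route mentioned above, the Win32 memory-mapping functions have to be declared by hand. A rough sketch of the relevant P/Invoke signatures and call sequence might look like this (constants trimmed, error handling omitted, and the wrapper assumes the file already exists and is non-empty):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Minimal P/Invoke declarations for memory-mapping a file on .NET 2.0.
// A real wrapper needs full error handling and safe handle management.
internal static class Win32Mmf
{
    private const uint PAGE_READWRITE = 0x04;
    private const uint FILE_MAP_ALL_ACCESS = 0x000F001F;

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    private static extern IntPtr CreateFileMapping(IntPtr hFile, IntPtr lpAttributes,
        uint flProtect, uint dwMaximumSizeHigh, uint dwMaximumSizeLow, string lpName);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr MapViewOfFile(IntPtr hMapping, uint dwDesiredAccess,
        uint dwFileOffsetHigh, uint dwFileOffsetLow, UIntPtr dwNumberOfBytesToMap);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool UnmapViewOfFile(IntPtr lpBaseAddress);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr hObject);

    // Maps the whole file and reads one byte, just to show the call sequence.
    public static byte ReadFirstByte(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            IntPtr mapping = CreateFileMapping(fs.SafeFileHandle.DangerousGetHandle(),
                IntPtr.Zero, PAGE_READWRITE, 0, 0, null);
            IntPtr view = MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, UIntPtr.Zero);

            byte value = Marshal.ReadByte(view);

            UnmapViewOfFile(view);
            CloseHandle(mapping);
            return value;
        }
    }
}
```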
You can't store 5 GB of data in memory efficiently. You have a 2 GB per-process limit on a 32-bit OS, and a 4 GB limit per 32-bit process under 64-bit Windows-on-Windows (WoW64).
So you have a choice:
Go the Google Chrome way (and Firefox 4) and spread portions of the data across multiple processes. This may be applicable if your application runs under a 64-bit OS and you have some reason to keep the app itself 32-bit, but it is not an easy route. And if you don't have a 64-bit OS, I wonder where you'd get more than 5 GB of RAM from anyway?
If you have a 32-bit OS, then any solution will be file-based. When you try to keep the data in memory (though I wonder how you even address it under 32-bit with the 2 GB per-process limit), the OS just continuously swaps portions of the data (memory pages) to disk and restores them again and again as you access them. You incur a big performance penalty, and you have already noticed it (I'm guessing from the description of your problem). The main problem is that the OS can't predict when you will need one piece of data and when you will want another, so it just tries to do its best by reading and writing memory pages to and from disk.
So you are already using disk storage indirectly, in an inefficient way; MMFs just give you the same solution in an efficient and controlled manner.
You can re-architect your application to use MMFs, and the OS will help you with efficient caching. Do a quick test yourself; MMFs may be good enough for your needs.
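If moving to .NET 4.0 ever becomes possible, that quick test is short with the built-in MemoryMappedFile class; a minimal sketch (the file name, map name, and sizes are made up):

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class MmfQuickTest
{
    static void Main()
    {
        // Create (or open) a 1 GB backing file and map a small window of it.
        using (var mmf = MemoryMappedFile.CreateFromFile("data.bin", FileMode.OpenOrCreate,
                                                         "demoMap", 1L << 30))
        using (var accessor = mmf.CreateViewAccessor(0, 4096))
        {
            accessor.Write(0, 12345);          // write an Int32 at offset 0
            int value = accessor.ReadInt32(0); // read it back
            Console.WriteLine(value);
        }
    }
}
```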
Anyway, I don't see any solution other than a file-based one for working with a dataset greater than the available RAM. And it is usually better to have direct control over the data manipulation, especially when that amount of data arrives and needs to be processed.
When you have to store huge loads of data and maintain accessibility, sometimes the most useful solution is a data storage and management system such as a database. A database (MySQL, for example) can store lots of typical data types and, of course, binary data too. Maybe you can store your objects in a database (directly, or through a programmed business object model) and fetch them when you need them. This can also solve many problems with data management (moving, backing up, searching, updating...) and storage (the data layer), and it is location-independent - maybe this point of view can help you.
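For completeness, a hedged sketch of what storing a serialised object as a BLOB could look like through plain ADO.NET; the Objects table, its column names, and the parameter style are assumptions, and any provider available on .NET 2.0 (MySQL, SQLite, SQL Server Compact, ...) would do:

```csharp
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public static class BlobStore
{
    // Serialises an object and stores it under a key in an assumed
    // Objects(ObjectKey, ObjectData) table; the connection is provider-agnostic.
    public static void Save(IDbConnection connection, string key, object value)
    {
        byte[] payload;
        using (MemoryStream ms = new MemoryStream())
        {
            new BinaryFormatter().Serialize(ms, value);
            payload = ms.ToArray();
        }

        using (IDbCommand cmd = connection.CreateCommand())
        {
            cmd.CommandText = "INSERT INTO Objects (ObjectKey, ObjectData) VALUES (@key, @data)";

            IDbDataParameter keyParam = cmd.CreateParameter();
            keyParam.ParameterName = "@key";
            keyParam.Value = key;
            cmd.Parameters.Add(keyParam);

            IDbDataParameter dataParam = cmd.CreateParameter();
            dataParam.ParameterName = "@data";
            dataParam.Value = payload;
            cmd.Parameters.Add(dataParam);

            cmd.ExecuteNonQuery();
        }
    }
}
```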