Limit RAM usage (C#.NET)
There are huge files, about 100 MB each.
I want to load them into memory (RAM), process them and save them somewhere.
At the same time I want a limit on memory usage: for example 100 MB, so that my app doesn't use more than that limit. If the limit is exceeded, the file is processed in parts.
My understanding of this:
var line = file.ReadLine();
var allowed = true;
while (allowed && line != null)
{
    var newObject = new SomeObject(line);
    list.Add(newObject);
    // Check whether the memory limit has been reached
    allowed = CheckUsedMemory();
    line = file.ReadLine();
}
How to limit the use of RAM?
How to implement the CheckUsedMemory method?
Thank you.
UPD
Thank you everybody for the good advice.
Comments (5)
You can try with:
or
The first will force a garbage collection (a cleanup) of the memory first, so it's slower (milliseconds).
Then read this to see how much memory your machine has:
How do you get the total amount of RAM the computer has?
Remember that if you are running as a 32-bit app you can't use all the memory, and that other processes could be using memory too!
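The snippets this answer refers to were lost in the page scrape. Based on the description ("the first will force a garbage collecting"), they were likely the two forms of `GC.GetTotalMemory`; the sketch below is an assumption along those lines, including a possible `CheckUsedMemory` for the question's loop:

```csharp
using System;

class MemoryCheck
{
    // 100 MB, matching the limit mentioned in the question.
    const long Limit = 100L * 1024 * 1024;

    static bool CheckUsedMemory()
    {
        // Passing true forces a full collection first (accurate but slower,
        // can take milliseconds); passing false returns a fast estimate.
        long used = GC.GetTotalMemory(false);
        return used < Limit;
    }
}
```

Note that `GC.GetTotalMemory` reports managed-heap bytes, not the process working set, so it undercounts what the OS sees.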
First, thanks for being aware of your memory consumption. If only more programmers were so considerate.
Second, I wouldn't bother: perhaps the user wants your application to run as fast as possible and is willing to burn 8000 megs of memory to get results 5% faster. Let them. :)
But artificially limiting the amount of memory your application takes may drastically increase processing time, if you force more disk accesses in the process. If someone is running on a memory-constrained system, they are liable to already have disk traffic from swapping -- if you artificially dump memory before you're really finished with it, you're only contributing further disk IO, getting in the way of the swapping. Let the OS handle this situation.
And lastly, the access pattern you've written here (sequential, one line at a time) is very common, and doubtless the .NET designers have put huge amounts of effort into getting the memory usage of this pattern down to the bare minimum. Adding objects to your internal trees in parts is a nice idea, but very few applications can really benefit from it. (Merge sorting is one excellent application that benefits greatly from partial processing.)
Depending upon what you're doing with your finished list of objects, you might not be able to improve upon working with the entire list at once. Or, you might benefit greatly from breaking it apart. (If MapReduce describes your data-processing problem well, then maybe you would benefit from breaking things apart.)
In any event, I'd be a little leery of using "memory" as the benchmark for deciding when to break apart processing: I'd rather use "1000 lines of input" or "ten levels of nesting" or "ran machine tools for five minutes" or something based on the input, rather than the secondary effect of memory consumed.
The normal procedure is not to load everything into memory, but rather to read the file in chunks, process each chunk and save it. If for some reason you have to keep everything in RAM (say, for sorting) then you may very well have to invest in more RAM.
This is an issue with the algorithm you are using, so the question should be how to solve the specific task without using too much memory.
GC.GetTotalMemory() will tell you how much memory you are using.
100 MB of RAM is not much today. Reading it into memory, processing it and putting it back to disk can be quite fast. Remember that you can't avoid copying it from disk to memory and back to disk anyway. Using a StringBuilder (not a String) to hold it would not necessarily add too much overhead to the app. Writing 100 MB in one operation is surely faster than one line at a time.
It looks like you want to process the file line by line, but it may help to know that, with .NET 4, you can use memory-mapped files, which let you access large files sparsely.
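A minimal sketch of the memory-mapped approach (the file name and the 4 KB window are arbitrary choices for illustration):

```csharp
using System;
using System.IO.MemoryMappedFiles;

class SparseRead
{
    static void Run(string path)
    {
        // Map the file; only the pages actually touched are read from disk.
        using (var mmf = MemoryMappedFile.CreateFromFile(path))
        using (var view = mmf.CreateViewAccessor(0, 4096))   // first 4 KB only
        {
            byte first = view.ReadByte(0);
            Console.WriteLine(first);
        }
    }
}
```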
You cannot really limit memory usage. You can only limit the amount of memory that you keep reserved; whether the rest of the memory is freed is up to the garbage collector.
So I would suggest that you track only the number of lines (or, preferably, the number of characters) that you are currently buffering before you process them.
In the comments people have suggested that you should read the file line by line. That is very good advice, assuming you are able to process the file one line at a time. The operating system will cache the file anyway, so you don't lose any performance.
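Buffering by character count, as suggested above, could be sketched like this (the budget value, file name and `Process` step are illustrative assumptions):

```csharp
using System.Collections.Generic;
using System.IO;

class CharBudgetBuffer
{
    const int CharBudget = 1000000;   // buffer roughly this many characters

    static void Run(string path)
    {
        var buffered = new List<string>();
        int chars = 0;
        foreach (var line in File.ReadLines(path))
        {
            buffered.Add(line);
            chars += line.Length;
            if (chars >= CharBudget)
            {
                Process(buffered);   // hypothetical processing step
                buffered.Clear();    // release the buffered lines to the GC
                chars = 0;
            }
        }
        if (buffered.Count > 0)
            Process(buffered);       // flush the final partial buffer
    }

    static void Process(List<string> lines) { /* process and save */ }
}
```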