How does mmap work?
I am working on a Linux program that needs to mmap a file from the hard drive, and I have a question: what can make it fail? For example, if memory is fragmented into free chunks of only 200M each, but I want to mmap a file into 1000M of memory, will it succeed?
Another question: are there any tools on Linux for reclaiming memory, like some tools on Windows, e.g. the built-in tool in XP?
Thanks.
Comments (2)
mmap() uses addresses outside your program's heap area, so heap fragmentation isn't a problem, except to the extent that it can make the heap take up more space and reduce the space available for mappings.
If you have lots of mapped files, you could potentially run into problems with fragmentation on a 32-bit system, where the address space is relatively constrained. On a 64-bit system, fragmentation is unlikely to be a problem: even if only small regions are available between existing mappings, there is still a huge amount of contiguous address space available next to them.
The more common problem on a 32-bit system is that the address space is just too small to map large files at all. Of the 4GB address space, typically only 2GB to 3GB is available to userspace (3GB with the default split on 32-bit x86 Linux), with the rest reserved by the kernel. Within that available space, your mappings have to share room with the program's code, its stacks (typically small) and its heap (potentially large).
In short, mmap() can often fail on 32-bit systems if the file is too large, but you're unlikely to ever have a file large enough to cause that problem on a 64-bit system.
If you're creating a private copy-on-write mapping, it can also fail due to lack of swap space. The kernel has to ensure that the sum of available RAM and swap is large enough to hold the size of your mapping, in case you modify all the pages so that the kernel is forced to make private copies of them all. A shared mapping shouldn't have this problem, since changes can be flushed to the file on disk, and then the pages can be discarded if memory is scarce and reloaded from disk later.
Of course, a mapping can also fail if you don't have permission to access the file, or if it's not a type of file that can be mapped (such as a directory or a socket).
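As a rough illustration of the failure cases above, here is a minimal sketch of mapping a whole file read-only and reporting why it failed. The helper name map_file_readonly is hypothetical (it is not from the answer), and the sketch assumes a Linux/POSIX system.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map an entire file read-only.
 * Returns the mapping and stores its length in *len_out,
 * or returns NULL with errno describing the failure. */
void *map_file_readonly(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;                /* e.g. EACCES (no permission), ENOENT */

    struct stat st;
    if (fstat(fd, &st) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return NULL;
    }
    if (st.st_size <= 0) {          /* mmap() rejects a zero-length mapping */
        close(fd);
        errno = EINVAL;
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    int saved = errno;
    close(fd);                      /* the mapping stays valid after close() */
    if (p == MAP_FAILED) {
        errno = saved;              /* e.g. ENOMEM (no address space), ENODEV (unmappable file type) */
        return NULL;
    }
    *len_out = (size_t)st.st_size;
    return p;
}
```

A caller would check for NULL, print strerror(errno) on failure, and release the mapping with munmap(p, len) when done.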
It's not clear what you mean about recollecting memory. Remember that the scarce resource that mmap() consumes isn't memory, it's address space. You can potentially map a 1GB file even if the machine actually only has 128MB of RAM, but on a 32-bit system you can't map a 4GB file even if the machine has 16GB of RAM.
The concept of virtual memory is essential to understanding what mmap() does, so read about that if you're not familiar with it already.
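To make the "address space, not memory" point concrete, the following sketch maps a 1 GiB sparse temporary file and reads a single byte from it. The function name map_sparse_gigabyte and the /tmp path template are assumptions made for illustration. Only the touched page needs physical memory, although the mapping still needs 1 GiB of contiguous address space, so it may fail in a crowded 32-bit process.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: map a 1 GiB sparse file and read its last byte.
 * Returns 0 on success, -1 on any failure. */
int map_sparse_gigabyte(void)
{
    const off_t size = (off_t)1 << 30;          /* 1 GiB */
    char path[] = "/tmp/mmap-sparse-XXXXXX";    /* assumed scratch location */

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* file is freed once fd and mapping are gone */

    if (ftruncate(fd, size) < 0) {  /* extends the file without writing any data */
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, (size_t)size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED) {
        perror("mmap");             /* ENOMEM here means no room in the address space */
        return -1;
    }

    char last = p[size - 1];        /* touches exactly one page; a hole reads as zero */
    munmap(p, (size_t)size);
    return last == 0 ? 0 : -1;
}
```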
mmap works by manipulating your process's page table (http://en.wikipedia.org/wiki/Page_table), a data structure your CPU uses to map address spaces. The CPU translates "virtual" addresses to "physical" ones according to the page table set up by your kernel.
When you access the mapped memory for the first time, your CPU generates a page fault. The OS kernel can then jump in, "fix up" the invalid memory access by allocating memory and doing file I/O into that newly allocated buffer, then continue your program's execution as if nothing happened.
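The first-touch page faults described above can be observed from user space. The sketch below counts the faults reported by getrusage() before and after touching each page of a fresh mapping. For simplicity it assumes an anonymous mapping rather than a file, so the kernel only has to allocate zeroed pages and does no file I/O; the function names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Total page faults (minor + major) taken by this process so far. */
static long fault_count(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt + ru.ru_majflt;
}

/* Hypothetical demo: map npages of anonymous memory, touch each page once,
 * and return how many page faults that caused (-1 if mmap() fails). */
long faults_on_first_touch(size_t npages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;

    volatile char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long before = fault_count();
    for (size_t i = 0; i < len; i += page)
        p[i] = 1;                   /* first write to each page traps into the kernel */
    long after = fault_count();

    munmap((void *)p, len);
    return after - before;
}
```

The exact count depends on the kernel (for example, huge pages can satisfy several touches with one fault), so the number is best read as "greater than zero" rather than "exactly npages".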
mmap can fail if your process is out of address space, which is something to watch out for these days with 32-bit code, where all usable addresses can be consumed pretty quickly by large data sets. It can also fail for any of the reasons listed in the "Errors" section of the manpage.
Accessing memory inside a mapped region can also fail if the kernel has issues allocating memory or doing I/O. In that case your process will get a SIGBUS signal.
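Allocation and I/O failures are hard to provoke on demand, so the sketch below uses a different but easily reproducible trigger for the same signal: on Linux, touching a mapped page that lies entirely beyond the end of the file also raises SIGBUS. The function name, the /tmp path template and the handler are assumptions for illustration, and recovering with siglongjmp is only suitable for a demo like this.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jmp, 1);         /* jump back out of the faulting access */
}

/* Hypothetical demo: returns 1 if reading a mapped page past EOF raised
 * SIGBUS, 0 if it did not, and -1 if the setup failed. */
int sigbus_past_eof(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/mmap-sigbus-XXXXXX";    /* assumed scratch location */

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)page) < 0) {       /* file is exactly one page long */
        close(fd);
        return -1;
    }

    /* Map two pages: the second one has no file data behind it. */
    volatile char *p = mmap(NULL, 2 * page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    volatile int got_sigbus = 0;
    if (sigsetjmp(bus_jmp, 1) == 0)
        (void)p[page];              /* mmap() succeeded, but this access faults */
    else
        got_sigbus = 1;

    sigaction(SIGBUS, &old, NULL);  /* restore the previous handler */
    munmap((void *)p, 2 * page);
    return got_sigbus;
}
```

The point of the sketch is that a successful mmap() call does not guarantee that every later access to the region will succeed.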