Why do fopen/fgets use both mmap and read system calls to access the data?
I have a small example program which simply fopens a file and uses fgets to read it. Using strace, I notice that the first call to fgets runs an mmap system call, and then read system calls are used to actually read the contents of the file. On fclose, the file is munmapped. If I instead read the file with open/read directly, this obviously does not occur. I'm curious what the purpose of this mmap is and what it is accomplishing.

On my Linux 2.6.31 based system, under heavy virtual memory demand these mmaps will sometimes hang for several seconds, and they appear to me to be unnecessary.

The example code:
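#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    FILE *f;
    char buf[256];

    if (NULL == (f = fopen("foo.txt", "r")))
    {
        printf("Fail to open\n");
        return EXIT_FAILURE;  /* do not pass a NULL stream to fgets */
    }
    fgets(buf, 256, f);       /* the first read is what triggers the mmap2 call */
    fclose(f);
    return EXIT_SUCCESS;
}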
And here is the relevant strace output when the above code is run:
open("foo.txt", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=9, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb8039000
read(3, "foo\nbar\n\n"..., 4096) = 9
close(3) = 0
munmap(0xb8039000, 4096) = 0
Comments (4)
It's not the file that is mmap'ed. In this case mmap is used anonymously (not on a file), probably to allocate memory for the buffer that the subsequent reads will use. malloc can in fact result in such a call to mmap. Similarly, the munmap corresponds to a call to free.

The mmap is not mapping the file; instead it's allocating memory for the stdio FILE buffering. Normally malloc would not use mmap to service such a small allocation, but it seems glibc's stdio implementation is using mmap directly to get the buffer. This is probably to ensure it's page-aligned (though posix_memalign could achieve the same thing) and/or to make sure closing the file returns the buffer memory to the kernel. I question the usefulness of page-aligning the buffer. Presumably it's for performance, but I can't see any way it would help unless the file offset you're reading from is also page-aligned, and even then it seems like a dubious micro-optimization.

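Since the mmap in the trace appears to be the stdio layer allocating its own buffer, one possible way to sidestep that allocation is to hand the stream a buffer yourself with the standard setvbuf function. The sketch below is only an illustration: the helper name first_line_with_own_buffer and its parameters are made up for this example, and whether the mmap2 call actually disappears on a particular glibc version is an assumption that should be checked with strace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): read the first line of
 * `path` into `line`, using the caller-supplied `iobuf` as the stream buffer
 * so that the library does not need to allocate one itself.
 * Returns 0 on success, -1 on failure. */
static int first_line_with_own_buffer(const char *path,
                                      char *line, int line_size,
                                      char *iobuf, size_t iobuf_size)
{
    assert(path != NULL && line != NULL && iobuf != NULL);
    assert(line_size > 0 && iobuf_size > 0);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* setvbuf has to be called before any other operation on the stream. */
    if (setvbuf(f, iobuf, _IOFBF, iobuf_size) != 0) {
        fclose(f);
        return -1;
    }

    int rc = (fgets(line, line_size, f) != NULL) ? 0 : -1;

    /* iobuf must remain valid until the stream has been closed. */
    fclose(f);
    return rc;
}
```

A caller would pass something like a static char iobuf[4096] together with its size. The trade-off is that the caller now owns the buffer's lifetime, which has to outlast the fclose call.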
From what I have read, memory mapping functions are useful when handling large files. What counts as large I have no idea, but for large files they can be significantly faster than the 'buffered' I/O calls.
In the example that you have posted, I think the file is opened by the open() function and mmap is used for allocating memory or something else. This can be seen from the signature of the mmap function:
void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off);
The second-to-last parameter takes the file descriptor, which should be non-negative when a file is being mapped, while in the strace output it is -1.
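To make the fd = -1 observation concrete, here is a rough reconstruction of the system call sequence in the trace (open, anonymous mmap, read, close, munmap) written directly against the POSIX calls. It is a sketch of what the trace suggests, not glibc's actual code; the helper name read_via_anonymous_buffer and the fixed 4096-byte size are assumptions taken from the strace output for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy up to out_size - 1 bytes from the start of
 * `path` into `out` (NUL-terminated), staging them in an anonymous mapping.
 * Returns the number of bytes copied, or -1 on failure. */
static ssize_t read_via_anonymous_buffer(const char *path,
                                         char *out, size_t out_size)
{
    assert(path != NULL && out != NULL && out_size > 0);
    const size_t buf_size = 4096;          /* same size as in the trace */

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* fd = -1 together with MAP_ANONYMOUS: zero-filled memory that is not
     * backed by any file. The mapping is only used as a scratch buffer. */
    char *buf = mmap(NULL, buf_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* The file contents still arrive through an ordinary read(). */
    ssize_t n = read(fd, buf, buf_size);
    close(fd);

    if (n >= 0) {
        if ((size_t)n > out_size - 1)
            n = (ssize_t)(out_size - 1);
        memcpy(out, buf, (size_t)n);
        out[n] = '\0';
    }

    munmap(buf, buf_size);                 /* hand the page back to the kernel */
    return n;
}
```

Running such a helper under strace should show the same pattern as the trace in the question: the mapping never refers to descriptor 3, and the data comes in through read.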
The source code of fopen in glibc shows that mmap can actually be used.
https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofopen.c;h=965d21cd978f3acb25ca23152993d9cac9f120e3;hb=HEAD#l36
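For context, the fopen(3) manual page documents an "m" mode character as a glibc extension that asks the library to attempt to access the file via mmap instead of read. That is separate from the anonymous mapping seen in the trace above. The sketch below shows how such a request might look; the helper name first_line_mmap_mode is invented for this example, and whether a given glibc version honours the flag (or silently falls back to read) is an assumption to verify with strace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of `path` into `line`, asking
 * the library to use mmap-based access if it supports that.
 * Returns 0 on success, -1 on failure. */
static int first_line_mmap_mode(const char *path, char *line, int line_size)
{
    assert(path != NULL && line != NULL && line_size > 0);

    /* "m" is a glibc extension and only a hint; other C libraries may
     * ignore it, and the stream behaves like a normal read-only stream. */
    FILE *f = fopen(path, "rm");
    if (f == NULL)
        return -1;

    int rc = (fgets(line, line_size, f) != NULL) ? 0 : -1;
    fclose(f);
    return rc;
}
```

From the caller's point of view the result is the same as with plain "r"; any difference would only show up in the system calls the library issues.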