Making fgets issue longer read() calls on Linux
I'm reading quite large lines of text (up to 128K) using fgets. I'm seeing excessive context switching on the server, and with strace I see the following:
read(3, "9005 10218 00840023102015 201008"..., 4096) = 4096
i.e. fgets reads in chunks of 4096 bytes at a time. Is there any way to control how large a chunk fgets uses when calling read()?
setvbuf would be the obvious place to start.
The function fgets() is part of the stdio package, and as such it must buffer (or not buffer) the input stream in a way that is consistent with also using fgetc(), fscanf(), fread() and so forth. That means that the buffer itself (if the stream is buffered) is a property of the FILE object.
Whether there is a buffer, and if so how large it is, can be suggested to the library by calling setvbuf(). The library implementation has a fair amount of latitude to ignore hints and do what it thinks best, but buffers that are "reasonable" powers of two in size will usually be accepted. You've noticed that the default was 4096, which is clearly smaller than optimal.
A stream opened on an actual file is buffered by default. A stream opened on a pipe, FIFO, TTY or anything else may have different buffering defaults.
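As a rough illustration of the setvbuf() suggestion, here is a minimal sketch in C. The helper name count_lines, the MAX_LINE constant and the buffer size passed in are assumptions made for this example, not anything from the question. It hands the stream a caller-allocated buffer, because an implementation may ignore the requested size when the buffer argument is NULL. Note that setvbuf() has to be called after the stream is opened but before any other operation on it, and the buffer must stay valid until the stream is closed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE (128 * 1024) /* longest expected line, per the question */

/* Hypothetical helper: counts the lines in `path`, reading through a
   caller-supplied stdio buffer of `bufsize` bytes. Returns -1 on error. */
long count_lines(const char *path, size_t bufsize)
{
    long count = -1;
    int at_line_start = 1;
    FILE *fp = fopen(path, "r");
    char *iobuf = malloc(bufsize);
    char *line = malloc(MAX_LINE + 2); /* room for '\n' and '\0' */

    if (fp == NULL || iobuf == NULL || line == NULL)
        goto done;

    /* Request full buffering with our own buffer. This must come before
       any read on the stream; the library is still free to refuse. */
    if (setvbuf(fp, iobuf, _IOFBF, bufsize) != 0)
        goto done;

    count = 0;
    while (fgets(line, MAX_LINE + 2, fp) != NULL) {
        size_t len = strlen(line);
        if (at_line_start)
            count++; /* this chunk begins a new line */
        /* The next chunk starts a new line only if this one ended in '\n'. */
        at_line_start = (len > 0 && line[len - 1] == '\n');
    }
    if (ferror(fp))
        count = -1;

done:
    if (fp != NULL)
        fclose(fp); /* close the stream before freeing the buffer it uses */
    free(iobuf);
    free(line);
    return count;
}
```

Calling something like count_lines("input.txt", 256 * 1024) under strace should show whether the library honored the request: if it did, the read() calls will ask for the larger size instead of 4096.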