C: multi-process stdio append mode
I wrote this code in C:
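#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>   /* gettimeofday, struct timeval */
#include <unistd.h>     /* getpid */

/* Seed rand() from the wall clock. */
void random_seed(void){
    struct timeval tim;
    gettimeofday(&tim, NULL);
    double t1=tim.tv_sec+(tim.tv_usec/1000000.0);
    srand((unsigned)t1);   /* the fractional part is dropped by the conversion */
}

int main(void){
    FILE *f;
    int i;
    int size=100;
    char *buf=malloc(size);
    if(buf==NULL) return 1;
    f = fopen("output.txt", "a");
    if(f==NULL){ free(buf); return 1; }
    setvbuf(f, buf, _IOFBF, size);   /* fully buffered, 100-byte buffer */
    random_seed();
    for(i=0; i<200; i++){
        fprintf(f, "[ xx - %d - 012345678901234567890123456789 - %d]\n", rand()%10, (int)getpid());
        fflush(f);
    }
    fclose(f);
    free(buf);
    return 0;
}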
This code opens a file in append mode and appends a string to it 200 times.
I set a buffer of size 100, which is large enough to hold the full string.
I then started multiple processes running this code with this bash script:
#!/bin/bash
gcc source.c
rm -f output.txt
for i in $(seq 1 100);
do
    ./a.out &
done
wait
I expected the strings in the output never to be mixed up: I read that when a file is opened with the O_APPEND flag, the file offset is set to the end of the file prior to each write, and I am using a fully buffered stream. However, the first line written by each process comes out mixed, like this:
[ xx - [ xx - 7 - 012345678901234567890123456789 - 22545]
and, some lines later:
2 - 012345678901234567890123456789 - 22589]
It looks as if the write is interrupted by the call to the rand function.
So why do these lines appear?
Is using file locks the only way to prevent this, even though I am only using append mode?
Thanks in advance!
Comments (2)
You will need to implement some form of concurrency control yourself; POSIX makes no guarantees about concurrent writes from multiple processes. You get some guarantees for pipes, but not for regular files written to from different processes.
See the POSIX specification of write(), at the end of the Rationale section.
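As one possible form of the concurrency control mentioned above, the sketch below takes an exclusive advisory lock with fcntl(F_SETLKW) around each append. The helper name locked_append is hypothetical and not from the original post. It assumes the descriptor was opened with O_APPEND and that every writer follows the same locking convention, since advisory locks do not stop processes that ignore them.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>    /* SEEK_SET */
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): append `len` bytes
 * from `line` to `fd` while holding an exclusive advisory lock on the
 * whole file. Returns 0 on success, -1 on error. */
static int locked_append(int fd, const char *line, size_t len)
{
    struct flock lk;
    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;     /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;            /* 0 means "to end of file", i.e. the whole file */

    /* Block until the lock is granted; retry if a signal interrupts us. */
    while (fcntl(fd, F_SETLKW, &lk) == -1) {
        if (errno != EINTR)
            return -1;
    }

    int rc = 0;
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, line + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted before writing anything: retry */
            rc = -1;
            break;
        }
        done += (size_t)n;   /* handle short writes */
    }

    /* Release the lock even if the write failed. */
    lk.l_type = F_UNLCK;
    if (fcntl(fd, F_SETLK, &lk) == -1)
        rc = -1;
    return rc;
}
```

A writer would format each record into a local buffer (for example with snprintf) and pass it to locked_append instead of going through fprintf and fflush, so that stdio buffering no longer decides where a record is split.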
You open the file in fully buffered mode. That means every line of output first goes into the buffer, and when the buffer overflows it is flushed to the file regardless of whether it contains incomplete lines. As a result, chunks of output from different processes writing to the same file concurrently get interleaved.
An easy fix would be to put the stream in line-buffered mode (_IOLBF), so that the buffer is flushed on each complete line. Just make sure the buffer is at least as big as your longest line; otherwise incomplete lines will be written. The buffer is normally flushed with a single write() system call, so lines from different processes won't interleave. There is no guarantee that the write() system call is atomic on every filesystem, but it normally works as expected because write() normally locks the file descriptor in the kernel with a mutex before proceeding.
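The line-buffered setup suggested above could be sketched as follows. The helper name open_line_buffered is an assumption made for illustration, not something from the original post, and whether each line actually reaches the kernel as a single write() depends on the C library, so treat this as a starting point to test rather than a guarantee.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): open `path` for
 * appending and switch the stream to line-buffered mode.
 * `buf` must stay valid until the stream is closed, and `size` should be
 * at least as large as the longest line that will be written. */
static FILE *open_line_buffered(const char *path, char *buf, size_t size)
{
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return NULL;

    /* setvbuf has to be called before any other operation on the stream. */
    if (setvbuf(f, buf, _IOLBF, size) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

With this in place, the loop from the question would call fprintf as before, and the explicit fflush after each line becomes optional because the newline already triggers the flush.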