Reading a specific number of lines from a file in C (scanf, fseek, fgets)
I have a master process that spawns N child processes, which communicate with the parent through unnamed pipes. I must be able to:
- have the parent open the file and then send each child a struct telling it to read from line min to line max;
- this is going to happen at the same time, so I don't know:
  - first, how to divide total_lines among the N maps, and
  - second, how to make each child read only the lines it is supposed to?
My problem does not concern O.S. concepts, only the file operations :S
Perhaps fseek? I can't mmap the log file (some are larger than 1GB).
I would appreciate some ideas. Thank you in advance.
EDIT: I'm trying to make the children read their respective lines without using fseek and chunk sizes, so could someone please tell me if this is valid?
//somewhere in the parent process:
FILE* logFile = fopen(filename, "r");
if (logFile == NULL) {
    perror("fopen");
    exit(EXIT_FAILURE);
}
//note: fgets splits a line longer than 1023 characters, so such a line is counted more than once
while (fgets(line, 1024, logFile) != NULL) {
    num_lines++;
}
rewind(logFile);
int prev = -1; //last line handed to the previous child
for (i = 0; i < maps_nr; i++) {
    struct send_to_Map request;
    //a FILE* is only meaningful in a child forked after the fopen, and the file offset is then shared;
    //it is safer for each child to fopen the file itself
    request.fp = logFile;
    request.lower = lowLimit;
    request.upper = highLimit;
    request.minLine = prev + 1;
    if (i != maps_nr - 1)
        request.maxLine = request.minLine + num_lines / maps_nr - 1;
    else
        request.maxLine = num_lines - 1; //the last child also takes the remainder
    prev = request.maxLine;
    //write this structure to respective pipe
}
//child process:
while (1) {
    ...
    //reads the structure from the pipe (and knows which lines to read)
    int n = 0, counter = 0;
    //stop as soon as the last assigned line has been read
    while (n <= maxLine && fgets(line, 1024, logFile) != NULL) {
        if (n >= minLine)
            counter += process(line); //returns 1 if IP was found, in that line, between the low and high limit
        n++;
    }
    //(...)
}
I don't know if it's going to work, I just want to make it work! Even like this, is it possible to outperform a single process that reads the whole file and prints the total number of IPs found in the log file?
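For the first part of the question (dividing the total number of lines among the N maps), one possible approach is to give every child the same base count and hand the remainder out one line at a time to the first children. This is only a sketch: the helper name line_range and its parameters are hypothetical and do not come from the code above.

```c
/* Hypothetical helper (not part of the original code): computes the inclusive,
 * 0-based line range [*min_line, *max_line] for child i out of n children.
 * The first (total % n) children each receive one extra line.
 * If a child receives no lines at all, *max_line ends up below *min_line. */
void line_range(long total, int n, int i, long *min_line, long *max_line)
{
    long base = total / n;                             /* lines every child gets */
    long extra = total % n;                            /* leftover lines */
    long start = i * base + (i < extra ? i : extra);   /* lines given to children before i */
    long count = base + (i < extra ? 1 : 0);

    *min_line = start;
    *max_line = start + count - 1;
}
```

With 10 lines and 3 children this yields the ranges 0-3, 4-6 and 7-9, so no child gets more than one line over the others.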
Comments (3)
If you don't care about dividing the file exactly evenly, and the distribution of line lengths is somewhat even over the entire file, you can avoid having the parent read the entire file once up front.
[...] start and reads chunk_size bytes.
That's a rough sketch of the strategy.
Edited to simplify things a bit.
Edit: here's some untested code for step 3, and step 4 below. This is all untested, and I haven't been careful about off-by-one errors, but it gives you an idea of the usage of fseek and ftell, which sounds like what you are looking for.
Then in your child (step 4), assume the child has access to child_chunks[] and knows its child_num:
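A minimal sketch of how fseek and ftell could be combined for this strategy, under several assumptions: the struct chunk layout, compute_chunks and count_lines_in_chunk are hypothetical names (only child_chunks, start and chunk_size appear in the answer), the file is opened in binary mode so that ftell offsets are plain byte counts, and each child opens the file itself so that file offsets are not shared.

```c
#include <stdio.h>

/* Assumed layout of one entry of child_chunks[]. */
struct chunk {
    long start;       /* byte offset where this child starts reading */
    long chunk_size;  /* number of bytes this child reads */
};

/* Parent side: split the file into n chunks, each ending just after a
 * newline (or at end of file). Returns 0 on success, -1 on error. */
int compute_chunks(FILE *fp, struct chunk *child_chunks, int n)
{
    if (fseek(fp, 0L, SEEK_END) != 0) return -1;
    long file_size = ftell(fp);
    if (file_size < 0) return -1;

    long approx = file_size / n;   /* rough chunk size before line alignment */
    long start = 0;
    for (int i = 0; i < n; i++) {
        long end = file_size;      /* the last child reads up to end of file */
        if (i < n - 1) {
            long target = (long)(i + 1) * approx;
            if (target < start) target = start;
            if (fseek(fp, target, SEEK_SET) != 0) return -1;
            int c;
            while ((c = getc(fp)) != EOF && c != '\n')
                ;                  /* skip forward to the end of the current line */
            end = ftell(fp);
            if (end < 0) return -1;
        }
        child_chunks[i].start = start;
        child_chunks[i].chunk_size = end - start;
        start = end;
    }
    return 0;
}

/* Child side: seek to the chunk and read it line by line. This version only
 * counts fgets calls (one per line when lines are shorter than the buffer);
 * a real child would call its own per-line routine instead.
 * Returns -1 if the seek fails. */
long count_lines_in_chunk(FILE *fp, const struct chunk *ck)
{
    char line[1024];
    long counter = 0;
    long end = ck->start + ck->chunk_size;

    if (fseek(fp, ck->start, SEEK_SET) != 0) return -1;
    while (ftell(fp) < end && fgets(line, sizeof line, fp) != NULL)
        counter++;
    return counter;
}
```

Because every boundary falls just after a newline, no line is split between two children, at the cost of chunks that are only approximately equal in size.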
I think it can help you: "Read specific range of lines form a text file"
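As a rough illustration of reading a specific range of lines with fgets alone, the sketch below copies lines min_line through max_line (0-based, inclusive) from one stream to another. The function name copy_line_range is hypothetical; it is not taken from the linked question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies lines min_line..max_line (0-based, inclusive)
 * from in to out and returns the number of lines copied. A line longer than
 * the buffer arrives in several fgets calls but is still counted once. */
long copy_line_range(FILE *in, FILE *out, long min_line, long max_line)
{
    char buf[1024];
    long n = 0;              /* number of the line currently being read */
    long copied = 0;
    int at_line_start = 1;   /* does the next fgets begin a new line? */

    while (n <= max_line && fgets(buf, sizeof buf, in) != NULL) {
        size_t len = strlen(buf);
        int ends_line = (len > 0 && buf[len - 1] == '\n');

        if (n >= min_line) {
            fputs(buf, out);
            if (at_line_start) copied++;
        }
        at_line_start = ends_line;
        if (ends_line) n++;  /* only a newline finishes the current line */
    }
    return copied;
}
```

The loop stops as soon as the last requested line has been read, so the rest of the file is never touched; the lines before min_line still have to be read and skipped, which is the cost fseek avoids.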