Multiple child processes + reading from a stream
Referring to my last question (Multiple child process), I am now trying to build an external sorting implementation using multiple child processes.
...
    fp = fopen(pathname, "r"); // open inputfile in r mode
    fgets(trash, 10, fp); // ignore first line
    for (i=0; i<numberOfProcess; ++i) {
#ifdef DBG
        fprintf(stderr, "\nDBG: Calling fork()\n");
#endif
        if ((pids[i] = fork()) < 0) {
            perror("fork error");
            exit(EXIT_FAILURE);
        } else if (pids[i] == 0) { // Child Code
            if (numbersToSort % numberOfProcess == 0) { // 16 % 4 = 0
                partialDataSize = numbersToSort / numberOfProcess;
                for (j=0; j<partialDataSize; j++) {
                    fscanf(fp, "%d", &arrayPartialData[j]);
                    qsort(arrayPartialData, partialDataSize, sizeof(int), (void *)comp_num);
                    //printf("%d\n", arrayPartialData[j]);
                    // TODO: qsort data until partialDataSize
                }
            }
            printf("pid: %d child process %d outputs: ", getpid(), pids[i]);
            printArray(arrayPartialData, partialDataSize);
            //break;
            exit(0);
        }
    }
    /* Wait for children to exit. */
    while (numberOfProcess > 0) {
        pid = wait(&status);
        --numberOfProcess;
    }
    fclose(fp);
But of course this code outputs the same sequence of sorted integers from the input file because of fscanf. For example, if the input file begins with 5 1 4, then with two child processes it outputs:
(1st child) 1 4 5
(2nd child) 1 4 5
because fscanf starts reading integers from the beginning of the input stream.
My problem now is: how can I continue reading the numbers from the point where the previous child process left off? For example, if the input file contains 5 1 4 8 5 10, then it could output:
(1st child) 1 4 5
(2nd child) 5 8 10
Thanks in advance ;)
Comments (4)
I'd use the lower-level open() and read() rather than their stdio stream equivalents; otherwise you'll have to worry about synchronizing the stdio buffers with the underlying file descriptor. Note that you'll still have issues reading complete numbers, so you'll probably need some synchronization between the processes.
As an alternative, I would suggest having a single process read the file and write a subset of the lines (using pipe()) to child processes that do the sorting and then write their results to another process that does the merge.
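The single-reader alternative might look roughly like the sketch below. It is an illustration, not the poster's code: the names sort_chunks_in_children, read_all, write_all, cmp_int and the MAX_CHILDREN limit are hypothetical, and it assumes the parent has already parsed the integers into an array whose length divides evenly among the children. Each child receives its slice over one pipe, sorts it, and sends it back over a second pipe; the merge step is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16 /* hypothetical limit for this sketch */

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Read exactly len bytes; fail on EOF or error. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; fail on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sort each of the nproc equal slices of data in its own child process. */
int sort_chunks_in_children(int *data, int total, int nproc)
{
    int to_child[MAX_CHILDREN][2], from_child[MAX_CHILDREN][2];
    pid_t pids[MAX_CHILDREN];
    int chunk, i, status, rc = 0;
    size_t bytes;

    if (total <= 0 || nproc <= 0 || nproc > MAX_CHILDREN || total % nproc != 0)
        return -1;
    chunk = total / nproc;
    bytes = (size_t)chunk * sizeof(int);

    for (i = 0; i < nproc; i++) {
        if (pipe(to_child[i]) < 0 || pipe(from_child[i]) < 0)
            return -1;
        if ((pids[i] = fork()) < 0)
            return -1;
        if (pids[i] == 0) { /* child: receive slice, sort, send back */
            int *buf = malloc(bytes);
            close(to_child[i][1]);
            close(from_child[i][0]);
            if (!buf || read_all(to_child[i][0], buf, bytes) < 0)
                _exit(1);
            qsort(buf, (size_t)chunk, sizeof(int), cmp_int);
            _exit(write_all(from_child[i][1], buf, bytes) < 0 ? 1 : 0);
        }
        /* parent: keep only the ends it uses, then hand over the slice */
        close(to_child[i][0]);
        close(from_child[i][1]);
        if (write_all(to_child[i][1], data + i * chunk, bytes) < 0)
            rc = -1;
        close(to_child[i][1]);
    }
    for (i = 0; i < nproc; i++) { /* collect sorted slices in order */
        if (read_all(from_child[i][0], data + i * chunk, bytes) < 0)
            rc = -1;
        close(from_child[i][0]);
        if (waitpid(pids[i], &status, 0) < 0 || status != 0)
            rc = -1;
    }
    return rc;
}
```

Because only the parent touches the input file, no stdio buffer is ever shared across fork(). With the question's input 5 1 4 8 5 10 and two children, the array ends up holding 1 4 5 followed by 5 8 10.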
If you're using fscanf, the only thing you can do is have each process read and discard numbers until it gets to the ones it should work on. In your case, child i discards i*partialDataSize numbers.
For example, with the input 5 7 3 1 4 8 5 10 2 split among three children you might have
5 7 3
1 4 8
5 10 2
which would sort to give
3 5 7
1 4 8
2 5 10.
Then you have to work out how to merge the sorted results.
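A minimal sketch of the skip-and-read idea, plus one simple way to merge the sorted runs, is shown below. The function names read_chunk, merge_chunks and cmp_int are hypothetical. The sketch assumes each worker reads from a stream positioned at the first number (for instance one it opened itself after fork(), so no stdio buffer is inherited from the parent), and that the sorted runs have been gathered back to back in one array before merging.

```c
#include <stdio.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Discard index*chunk integers, then read up to chunk integers into out.
 * Returns how many integers were stored. */
int read_chunk(FILE *fp, int index, int chunk, int *out)
{
    int skipped, discard, n = 0;

    for (skipped = 0; skipped < index * chunk; skipped++)
        if (fscanf(fp, "%d", &discard) != 1)
            return 0;
    while (n < chunk && fscanf(fp, "%d", &out[n]) == 1)
        n++;
    return n;
}

/* Merge k sorted runs of length chunk, stored back to back in src, into dst.
 * Picks the smallest head among the runs each step. Returns 0 on success. */
int merge_chunks(const int *src, int k, int chunk, int *dst)
{
    int *pos = calloc((size_t)k, sizeof *pos); /* next unread index per run */
    int i, out;

    if (!pos)
        return -1;
    for (out = 0; out < k * chunk; out++) {
        int best = -1;
        for (i = 0; i < k; i++) {
            if (pos[i] == chunk)
                continue; /* this run is exhausted */
            if (best < 0 || src[i * chunk + pos[i]] < src[best * chunk + pos[best]])
                best = i;
        }
        dst[out] = src[best * chunk + pos[best]];
        pos[best]++;
    }
    free(pos);
    return 0;
}
```

For the nine numbers above, worker 1 would call read_chunk(fp, 1, 3, buf), get 1 4 8, and sort it with qsort(buf, 3, sizeof(int), cmp_int). The linear scan over run heads in merge_chunks is fine for a handful of runs; a heap would be the usual choice when k is large.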
If you can store your integers as binary, you can have the first thread read its block.
Then the 2nd thread can skip the block which has already been read (because you know the size of each block) and begin reading from there, without needing to discard any data.
I also recommend you use threads, as forking is very expensive. threads tutorial
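With fixed-size binary integers, the start of each block is a simple byte offset, so a worker can seek straight to it. A minimal sketch, assuming the file holds raw int values written on the same machine (same size and byte order) and that each worker has its own stream; the name read_block is hypothetical:

```c
#include <stdio.h>

/* Seek to block number index (each block is count ints) and read it.
 * Returns 0 on success, -1 if the seek fails or the block is incomplete. */
int read_block(FILE *fp, long index, size_t count, int *out)
{
    long offset = index * (long)(count * sizeof(int));

    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof(int), count, fp) == count ? 0 : -1;
}
```

Worker i would call read_block(fp, i, partialDataSize, buf) on a stream it opened with fopen(pathname, "rb"), then sort buf as before.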
You were working with linked channels.
From the glibc manual, section 13.5.1 (emphasis mine)
Apparently, you cannot do I/O on both streams simultaneously.