Multiple child processes + reading from a stream

Posted on 2024-07-20 04:05:33 | 1597 characters | 7 views | 0 comments

Referring to my last question (Multiple child process), I am now trying to write an external sorting implementation using multiple child processes.

...
fp = fopen(pathname, "r"); // open inputfile in r mode
fgets(trash, 10, fp); // ignore first line

for (i=0; i<numberOfProcess; ++i) {
    #ifdef DBG
        fprintf(stderr, "\nDBG: Calling fork()\n"); 
    #endif

    if ((pids[i] = fork()) < 0) {
        perror("fork error");
        exit(EXIT_FAILURE);

    } else if (pids[i] == 0) { // Child Code

        if (numbersToSort % numberOfProcess == 0) { // 16 % 4 = 0
            partialDataSize = numbersToSort / numberOfProcess;          

            for (j=0; j<partialDataSize; j++) { 
                fscanf(fp, "%d", &arrayPartialData[j]);
                qsort(arrayPartialData, partialDataSize, sizeof(int), (void *)comp_num);

                //printf("%d\n", arrayPartialData[j]);
                // TODO: qsort data until partialDataSize
            }

        } 
        printf("pid: %d child process %d outputs: ", getpid(), pids[i]);
        printArray(arrayPartialData, partialDataSize);
        //break;
        exit(0);
    }  
}   

/* Wait for children to exit. */

while (numberOfProcess > 0) {
    pid = wait(&status);
    --numberOfProcess;
}

fclose(fp);

But of course this code outputs the same sequence of sorted integers from the input file in every child, because of fscanf. For example, if the input file begins with 5 1 4, it outputs:

(1st child) 1 4 5
(2nd child) 1 4 5

(with two child processes), because fscanf starts reading integers from the beginning of the input stream in each child.

My problem now is: how can I continue reading the numbers from the point where the previous child process left off? For example, if the input file contains 5 1 4 8 5 10, the output could be:

(1st child) 1 4 5

(2nd child) 5 8 10

Thanks in advance ;)

Comments (4)

街角迷惘 2024-07-27 04:05:33

I'd use the lower-level open() and read() rather than the stream equivalents, as otherwise you'll have to worry about synchronizing the stdio buffers with the underlying file descriptor. Note that you'll still have trouble reading complete numbers (a read() can stop in the middle of one), so you'll probably need some synchronization between the processes.
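A minimal sketch of the read() approach, assuming the numbers are whitespace-separated decimal text. The helper name `read_int_fd` is hypothetical and not from the question. It reads one byte at a time so that the descriptor's offset never runs past the number it returns, at the cost of one system call per byte. Children sharing the descriptor would still have to take turns, which is the synchronization mentioned above.

```c
#include <ctype.h>
#include <unistd.h>

/* Hypothetical helper: parse one whitespace-delimited decimal integer
 * from fd using unbuffered one-byte reads, so the file offset ends just
 * past the number (plus one trailing delimiter, if there is one).
 * Returns 1 on success, 0 on end of file, -1 on error or malformed input.
 * Overflow is not checked in this sketch. */
int read_int_fd(int fd, int *out)
{
    char c;
    ssize_t n;
    int sign = 1, value = 0, digits = 0;

    /* Skip leading whitespace. */
    do {
        n = read(fd, &c, 1);
        if (n < 0)
            return -1;
        if (n == 0)
            return 0;
    } while (isspace((unsigned char)c));

    /* Optional sign. */
    if (c == '-' || c == '+') {
        if (c == '-')
            sign = -1;
        n = read(fd, &c, 1);
        if (n <= 0)
            return -1;
    }

    /* Accumulate digits until a non-digit byte or end of file. */
    while (isdigit((unsigned char)c)) {
        value = value * 10 + (c - '0');
        digits++;
        n = read(fd, &c, 1);
        if (n < 0)
            return -1;
        if (n == 0)
            break;
    }
    if (digits == 0)
        return -1;

    *out = sign * value;
    return 1;
}
```

A child would call this partialDataSize times on the inherited descriptor; because nothing is buffered in user space, the next reader continues exactly where the previous one stopped.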

As an alternative I would suggest a single process to read the file and write a subset of the lines to subprocesses that do the sorting (using pipe()), which they then write to another process doing the merge.
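The pipe() alternative could be sketched roughly as follows. This is only an illustration under several assumptions: the parent has already read all numbers into an array (for example with fscanf), the count divides evenly among the children as in the question, and the merge step is left out. The names `sort_chunks`, `child_sort`, `read_all`, `write_all` and `cmp_int` are hypothetical. A child that dies early could raise SIGPIPE in the parent, which this sketch does not handle.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Transfer exactly len bytes; return 0 on success, -1 otherwise. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Child side: receive n ints, sort them, send them back, then exit. */
static void child_sort(int in_fd, int out_fd, size_t n)
{
    int *buf = malloc(n * sizeof *buf);
    if (buf == NULL || read_all(in_fd, buf, n * sizeof *buf) < 0)
        _exit(EXIT_FAILURE);
    qsort(buf, n, sizeof *buf, cmp_int);
    _exit(write_all(out_fd, buf, n * sizeof *buf) < 0 ? EXIT_FAILURE : EXIT_SUCCESS);
}

/* Hypothetical driver: split data[0..n) into nproc equal chunks, sort each
 * chunk in its own child, and store the sorted chunks back in place.
 * n must be a positive multiple of nproc. Returns 0 on success, -1 otherwise. */
int sort_chunks(int *data, size_t n, size_t nproc)
{
    if (nproc == 0 || n == 0 || n % nproc != 0)
        return -1;
    size_t chunk = n / nproc, bytes = chunk * sizeof *data, started = 0;
    int ok = 1;
    int (*to)[2] = malloc(nproc * sizeof *to);     /* parent -> child pipes */
    int (*from)[2] = malloc(nproc * sizeof *from); /* child -> parent pipes */
    if (to == NULL || from == NULL) { free(to); free(from); return -1; }

    /* Start one child per chunk; each gets its own pair of pipes. */
    for (; started < nproc; started++) {
        size_t i = started;
        if (pipe(to[i]) < 0) { ok = 0; break; }
        if (pipe(from[i]) < 0) { close(to[i][0]); close(to[i][1]); ok = 0; break; }
        pid_t pid = fork();
        if (pid == 0) {
            close(to[i][1]);
            close(from[i][0]);
            child_sort(to[i][0], from[i][1], chunk); /* never returns */
        }
        close(to[i][0]);
        close(from[i][1]);
        if (pid < 0) { close(to[i][1]); close(from[i][0]); ok = 0; break; }
    }

    /* Hand each child its chunk, then collect the sorted chunks in order. */
    for (size_t i = 0; i < started; i++) {
        if (write_all(to[i][1], data + i * chunk, bytes) < 0) ok = 0;
        close(to[i][1]);
    }
    for (size_t i = 0; i < started; i++) {
        if (read_all(from[i][0], data + i * chunk, bytes) < 0) ok = 0;
        close(from[i][0]);
    }
    for (size_t i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) ok = 0;
    }
    free(to);
    free(from);
    return ok ? 0 : -1;
}
```

With the question's sample input, `int data[] = {5, 1, 4, 8, 5, 10}; sort_chunks(data, 6, 2);` leaves `data` holding the two sorted halves 1 4 5 and 5 8 10. Only the parent touches the input file, so the stdio buffering problem does not arise.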

夜还是长夜 2024-07-27 04:05:33

If you're using fscanf, the only thing you can do is have each process read and discard numbers until it reaches the ones it should work on. In your case, child i discards i*partialDataSize numbers.

For example, with the input 5 7 3 1 4 8 5 10 2 and three children, you might have
5 7 3

1 4 8

5 10 2

which would sort to give

3 5 7

1 4 8

2 5 10.
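The skip-and-read step described above might look like the sketch below. The helper `read_chunk` is a hypothetical name. It assumes each child has its own stream positioned at the first number, for instance by calling fopen itself after fork and skipping the header line, so that no stdio buffer is shared with the parent or with sibling processes.

```c
#include <stdio.h>

/* Hypothetical helper: skip index * chunk_size integers on fp, then read
 * the next chunk_size integers into out. Returns 0 on success, -1 if the
 * input ends early or contains something that is not an integer. */
int read_chunk(FILE *fp, int index, int chunk_size, int *out)
{
    int discard;
    long to_skip = (long)index * chunk_size;

    /* Read and throw away the numbers that belong to earlier children. */
    for (long k = 0; k < to_skip; k++)
        if (fscanf(fp, "%d", &discard) != 1)
            return -1;

    /* Read this child's own numbers. */
    for (int j = 0; j < chunk_size; j++)
        if (fscanf(fp, "%d", &out[j]) != 1)
            return -1;

    return 0;
}
```

With the nine-number example, `read_chunk(fp, 1, 3, out)` fills `out` with 1 4 8, which the child can then pass to qsort once, after the whole chunk has been read.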

Then you have to work out how to merge the sorted results.
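One possible way to do the merge, assuming the sorted chunks have been collected back to back in a single array and all have the same length. `merge_runs` is a hypothetical name, and the linear scan over the runs is chosen for brevity; a heap would scale better with many runs.

```c
#include <stdlib.h>

/* Hypothetical helper: merge `count` sorted runs of `len` ints each,
 * stored back to back in src, into dst (which must have room for
 * count * len ints). Returns 0 on success, -1 if allocation fails. */
int merge_runs(const int *src, size_t count, size_t len, int *dst)
{
    /* pos[r] is the index of the next unread element of run r. */
    size_t *pos = calloc(count ? count : 1, sizeof *pos);
    if (pos == NULL)
        return -1;

    for (size_t k = 0; k < count * len; k++) {
        size_t best = count; /* count means "no run chosen yet" */
        for (size_t r = 0; r < count; r++) {
            if (pos[r] == len)
                continue; /* run r is exhausted */
            if (best == count ||
                src[r * len + pos[r]] < src[best * len + pos[best]])
                best = r;
        }
        dst[k] = src[best * len + pos[best]];
        pos[best]++;
    }

    free(pos);
    return 0;
}
```

For the sorted chunks 3 5 7, 1 4 8 and 2 5 10 this produces 1 2 3 4 5 5 7 8 10.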

飘然心甜 2024-07-27 04:05:33

If you can store your integers in binary, the first thread can read its block with

fread(arrayPartialData, sizeof(int), partialDataSize, fp);

Then the 2nd thread can skip the blocks that come before its own (because you know the size of each block) and begin reading from there, without needing to discard any data:

fseek(fp, (long)(threadNumber * partialDataSize * sizeof(int)), SEEK_SET);

I also recommend you use threads, as forking is comparatively expensive.
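Put together, the seek-then-read idea could be sketched like this. It assumes the file holds raw `int` values written by the same kind of machine (same size and byte order) with no header line, and that each worker opens its own stream so positions are not shared. `read_binary_chunk` is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: open path, seek to the start of block `index`
 * (each block is chunk_size ints), and read that block into out.
 * Returns 0 on success, -1 on any failure or short read. */
int read_binary_chunk(const char *path, size_t index, size_t chunk_size, int *out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* The offset is in bytes, so the element count is scaled by sizeof(int). */
    long offset = (long)(index * chunk_size * sizeof(int));
    int ok = fseek(fp, offset, SEEK_SET) == 0 &&
             fread(out, sizeof(int), chunk_size, fp) == chunk_size;

    fclose(fp);
    return ok ? 0 : -1;
}
```

Worker number k would call `read_binary_chunk(pathname, k, partialDataSize, arrayPartialData)` and then sort its array.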

极致的悲 2024-07-27 04:05:33

You were working with linked channels.

From the glibc manual, section 13.5.1 (emphasis mine):

Channels that come from a single opening share the same file position; we call them linked channels. Linked channels result when you make a stream from a descriptor using fdopen, when you get a descriptor from a stream with fileno, when you copy a descriptor with dup or dup2, and when descriptors are inherited during fork.

Apparently, you cannot do I/O on both streams simultaneously.

If you have been using a stream for I/O (or have just opened the stream), and you want to do I/O using another channel (either a stream or a descriptor) that is linked to it, you must first clean up the stream that you have been using.
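To make the "descriptors inherited during fork share the same file position" part concrete, here is a small demonstration sketch. The function name `offset_after_child_read` is hypothetical. It uses open() and read() directly so that no stdio buffer is involved. In the question, by contrast, the parent's fgets had probably already pulled the start of the file into the stream's buffer before fork, and each child then read from its own copy of that buffer.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: open path, let a forked child read nbytes through
 * the inherited descriptor, and return the file offset the parent sees
 * afterwards. Returns -1 on any failure. */
long offset_after_child_read(const char *path, size_t nbytes)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: consume nbytes from the shared open file description. */
        char c;
        for (size_t k = 0; k < nbytes; k++)
            if (read(fd, &c, 1) != 1)
                _exit(1);
        _exit(0);
    }

    /* Parent: it never read anything itself, yet its offset has moved. */
    int status;
    long offset = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) == 0)
        offset = (long)lseek(fd, 0, SEEK_CUR);

    close(fd);
    return offset;
}
```

If the child reads 6 bytes, the parent observes offset 6 even though it performed no reads, which is the linked-channel behavior the manual describes.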
