How do I disable buffering of stdout redirected to a file in perl?
Here's a script that launches 10 processes, each writing 100,000 lines to its STDOUT, which is inherited from the parent:
#!/usr/bin/env perl
# buffering.pl
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);

$| = 1; # don't think this does anything with syswrite...

# start 10 jobs which write 100,000 lines each
for (1 .. 10) {
    $pm->start and next;
    for my $j (1 .. 100_000) {
        syswrite(\*STDOUT, "$j\n");
    }
    $pm->finish;
}
$pm->wait_all_children;
If I pipe to another process, all is well:
$ perl buffering.pl | wc -l
1000000
But if I redirect to a file on disk, the syswrites clobber each other.
$ perl buffering.pl > tmp.txt ; wc -l tmp.txt
457584 tmp.txt
What's more, if I open a separate write filehandle in each child process and write directly to tmp.txt:
#!/usr/bin/env perl
# buffering2.pl
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);

$| = 1;

for (1 .. 10) {
    $pm->start and next;
    open my $fh, '>', 'tmp.txt';
    for my $j (1 .. 100_000) {
        syswrite($fh, "$j\n");
    }
    close $fh;
    $pm->finish;
}
$pm->wait_all_children;
tmp.txt ends up with 100,000 lines (each child truncates the file and writes the same 100,000 lines over the top of the others):
$ perl buffering2.pl; wc -l tmp.txt
100000 tmp.txt
So redirection via '>' to a file on disk involves some sort of buffering, but piping to a process doesn't? What's the deal?
1 Answer
When you redirect the whole perl script you get one file descriptor (created by the shell when you do > tmp.txt and inherited as stdout by perl), which is dup'd to each child, so all of the children share a single kernel file offset. When you explicitly open in each child you get different file descriptors (not dups of the original), each with its own offset. You should be able to replicate the shell redirection case if you hoist open my $fh, '>', 'tmp.txt' out of your loop.

The pipe case works because you're talking to a pipe and not a file, and a pipe has no notion of an offset that can be inadvertently shared in the kernel, as described above.
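As a rough illustration of that point, here is a minimal sketch, not part of the answer itself, that hoists the open above the fork loop so every child inherits a dup of the same descriptor, just as the shell's > tmp.txt redirection does. The script name and the final sysseek check are my own additions for illustration:

#!/usr/bin/env perl
# hoisted.pl -- hypothetical name; sketch of hoisting the open out of the loop
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;
use Fcntl qw(SEEK_CUR);
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);

# Open once, before any fork: the kernel file offset behind $fh is now
# shared by the parent and every child (same open file description).
open my $fh, '>', 'tmp.txt';

for (1 .. 10) {
    $pm->start and next;
    for my $j (1 .. 100_000) {
        syswrite($fh, "$j\n");   # children race on the shared offset
    }
    $pm->finish;
}
$pm->wait_all_children;

# The parent never wrote a byte, yet its position has been advanced by the
# children's writes -- the offset lives in the shared descriptor, not in
# any per-process buffer.
printf "parent's file offset after the children finish: %d\n",
    sysseek($fh, 0, SEEK_CUR);
close $fh;

Counting the lines of tmp.txt after running this should give a short count comparable to the > tmp.txt redirection case, and the parent's offset comes back non-zero even though the parent never wrote, which is the shared-offset behaviour described above; a per-child open (or a pipe, which has no offset at all) avoids it.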