How do I disable buffering when stdout is redirected to a file in Perl?

Here's a script that launches 10 processes, each writing 100,000 lines to its STDOUT, which is inherited from the parent:

#!/usr/bin/env perl
# buffering.pl
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;

use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new(4);

$|=1; # don't think this does anything with syswrite...

# start 10 jobs which write 100,000 lines each
for (1 .. 10 ) {
    $pm->start and next;

    for my $j (1 .. 100_000) {
        syswrite(\*STDOUT,"$j\n");
    }

    $pm->finish;
}
$pm->wait_all_children;

If I pipe to another process, all is well:

$ perl buffering.pl | wc -l
1000000

But if I redirect to a file on disk, the syswrites clobber each other:

$ perl buffering.pl > tmp.txt ; wc -l tmp.txt
457584 tmp.txt

What's more, if I open a write filehandle in each child process and write directly to tmp.txt:

#!/usr/bin/env perl
# buffering2.pl
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;

use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new(4);

$|=1;

for (1 .. 10) {
    $pm->start and next;
    open my $fh, '>', 'tmp.txt';

    for my $j (1 .. 100_000) {
        syswrite($fh,"$j\n");
    }
    close $fh;

    $pm->finish;
}
$pm->wait_all_children;

tmp.txt ends up with only 100,000 lines, not the expected 1,000,000:

$ perl buffering2.pl; wc -l tmp.txt
100000 tmp.txt

So redirection via '>' to disk has some sort of buffering but redirection to a process doesn't? What's the deal?

Answer (你是年少的欢喜, 2024-11-25):

When you redirect the whole Perl script you get one file descriptor (created by the shell when you do > tmp.txt and inherited as stdout by perl), which is dup'd to each child. When you explicitly open in each child you get different file descriptors (not dups of the original). You should be able to replicate the shell-redirection case if you hoist open my $fh, '>', 'tmp.txt' out of the loop.
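
A minimal sketch of that hoisting (hypothetical name hoisted.pl; otherwise the same structure as buffering.pl): the parent opens tmp.txt once before forking, every child inherits a dup of that descriptor, and the children then share a single kernel file offset, so the writes race just as they do under the shell's > tmp.txt redirection.

#!/usr/bin/env perl
# hoisted.pl -- hypothetical sketch: open once in the parent, before forking.
# Each child inherits a dup of this descriptor, so all children share one
# kernel file offset, reproducing the lossy shell-redirection behaviour.
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;

use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new(4);

open my $fh, '>', 'tmp.txt';   # one open file description, shared by all children

for (1 .. 10) {
    $pm->start and next;

    for my $j (1 .. 100_000) {
        syswrite($fh, "$j\n"); # writes race on the shared offset
    }

    $pm->finish;
}
$pm->wait_all_children;
close $fh;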

The pipe case works because you're talking to a pipe rather than a file: a pipe has no notion of a file offset that could be inadvertently shared in the kernel, as described above.
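
Not from the answer above, but a common way to actually get all 1,000,000 lines into a regular file is to have each child open it in append mode: with O_APPEND, the kernel repositions the offset to end-of-file atomically on every write, so separate descriptors can't overwrite one another (the interleaving order of lines across children is still unspecified). A sketch under those assumptions, with a hypothetical name buffering3.pl:

#!/usr/bin/env perl
# buffering3.pl -- hypothetical variant: append-mode writes from each child.
use 5.10.0;
use strict;
use warnings FATAL => "all";
use autodie;
use Fcntl qw(O_WRONLY O_CREAT O_APPEND);

use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new(4);

unlink 'tmp.txt' if -e 'tmp.txt';  # start from an empty file

for (1 .. 10) {
    $pm->start and next;

    # O_APPEND: the kernel moves the offset to end-of-file atomically on
    # every write, so children cannot overwrite each other's data.
    sysopen my $fh, 'tmp.txt', O_WRONLY | O_CREAT | O_APPEND;

    for my $j (1 .. 100_000) {
        syswrite($fh, "$j\n");
    }
    close $fh;

    $pm->finish;
}
$pm->wait_all_children;

Here wc -l tmp.txt should report 1,000,000, since each syswrite issues a single small write(2) that lands at the then-current end of file.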
