Perl:避免从标准输入贪婪读取?

发布于 2024-09-19 02:25:20 字数 557 浏览 3 评论 0原文

考虑以下 perl 脚本 (read.pl):

my $line = <STDIN>;
print "Perl read: $line";
print "And here's what cat gets: ", `cat -`;

如果从命令行执行此脚本,它将获取第一行输入,而 cat 获取其他所有内容,直到输入结束(按下^D)。

然而,当输入从另一个进程通过管道传输或从文件读取时,情况会有所不同:

$ echo "foo\nbar" | ./read.pl
Perl read: foo
And here's what cat gets:

Perl 似乎在某处逐渐缓冲整个输入,并且使用反引号或系统调用的进程看不到任何输入。

问题是我想对混合 和调用其他进程的脚本进行单元测试。最好的方法是什么?我可以关闭 perl 中的输入缓冲吗?或者我可以以“模仿”终端的方式假脱机数据吗?

Consider the following perl script (read.pl):

my $line = <STDIN>;
print "Perl read: $line";
print "And here's what cat gets: ", `cat -`;

If this script is executed from the command line, it will get the first line of input, while cat gets everything else until the end of input (^D is pressed).

However, things are different when the input is piped from another process or read from a file:

$ echo "foo\nbar" | ./read.pl
Perl read: foo
And here's what cat gets:

Perl seems to greadily buffer the entire input somewhere, and processes called using backticks or system do no see any of the input.

The problem is that I'd like to unit test a script that mixes <STDIN> and calls to other processes. What would be the best way to do this? Can I turn off input buffering in perl? Or can I spool the data in a way that will "mimic" a terminal?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

坦然微笑 2024-09-26 02:25:20

这不是 Perl 问题。这是一个 UNIX/shell 问题。当您运行不带管道的命令时,您处于行缓冲模式,但是当您使用管道重定向时,您处于块缓冲模式。您可以通过以下内容看到这一点:

cat /usr/share/dict/words | ./read.pl | head

这个 C 程序也有同样的问题:

#include <stdio.h>

int main(int argc, char** argv) {
    char line[4096];
    FILE* cat;
    fgets(line, 4096, stdin);
    printf("C got: %s\ncat got:\n", line);
    cat = popen("cat", "r");
    while (fgets(line, 4096, cat)) {
        printf("%s", line);
    }
    pclose(cat);
    return 0;
}

This is not a Perl problem. It is a UNIX/shell problem. When you run a command without pipes you are in line buffering mode, but when you redirect with pipes, you are in block buffering mode. You can see this by saying:

cat /usr/share/dict/words | ./read.pl | head

This C program has the same problem:

#include <stdio.h>

int main(int argc, char** argv) {
    char line[4096];
    FILE* cat;
    fgets(line, 4096, stdin);
    printf("C got: %s\ncat got:\n", line);
    cat = popen("cat", "r");
    while (fgets(line, 4096, cat)) {
        printf("%s", line);
    }
    pclose(cat);
    return 0;
}
鸠书 2024-09-26 02:25:20

我有好消息和坏消息。

好消息是对 read.pl 的简单修改允许您给它假输入:

#! /usr/bin/perl

use warnings;
use strict;

binmode STDIN, "unix" or die "$0: binmode: $!";

my $line = <STDIN>;
print "Perl read: $line";
print "And here's what cat gets: ", `cat -`;

示例运行:

$ printf "A\nB\nC\nD\n" | ./read.pl 
Perl read: A
And here's what cat gets: B
C
D

坏消息是您会得到一次切换:如果您尝试重复 read-then-cat ,第一个 cat 将使所有后续读取陷入饥饿。要看到这一点,请考虑

#! /usr/bin/perl

use warnings;
use strict;

binmode STDIN, "unix" or die "$0: binmode: $!";

my $line = <STDIN>;
print "1: Perl read: $line";
print "1: And here's what cat gets: ", `cat -`;
$line = <STDIN>;
$line = "<undefined>\n" unless defined $line;
print "2: Perl read: $line";
print "2: And here's what cat gets: ", `cat -`;

一个示例运行,该运行会产生

$ printf "A\nB\nC\nD\n" | ./read.pl 
1: Perl read: A
1: And here's what cat gets: B
C
D
2: Perl read: <undefined>
2: And here's what cat gets: 

I have good news and bad news.

The good news is a simple modification of read.pl allows you to give it fake input:

#! /usr/bin/perl

use warnings;
use strict;

binmode STDIN, "unix" or die "$0: binmode: $!";

my $line = <STDIN>;
print "Perl read: $line";
print "And here's what cat gets: ", `cat -`;

Sample run:

$ printf "A\nB\nC\nD\n" | ./read.pl 
Perl read: A
And here's what cat gets: B
C
D

The bad news is you get a single switchover: if you try to repeat the read-then-cat, the first cat will starve all subsequent reads. To see this, consider

#! /usr/bin/perl

use warnings;
use strict;

binmode STDIN, "unix" or die "$0: binmode: $!";

my $line = <STDIN>;
print "1: Perl read: $line";
print "1: And here's what cat gets: ", `cat -`;
$line = <STDIN>;
$line = "<undefined>\n" unless defined $line;
print "2: Perl read: $line";
print "2: And here's what cat gets: ", `cat -`;

and then a sample run that produces

$ printf "A\nB\nC\nD\n" | ./read.pl 
1: Perl read: A
1: And here's what cat gets: B
C
D
2: Perl read: <undefined>
2: And here's what cat gets: 
听不够的曲调 2024-09-26 02:25:20

今天我想我已经找到了我需要的东西:Perl 有一个名为 Expect 的模块,它非常适合此类情况:

#!/usr/bin/perl

use strict;
use warnings;

use Expect;

my $exp = Expect->spawn('./read.pl');
$exp->send("First Line\n");
$exp->send("Second Line\n");
$exp->send("Third Line\n");
$exp->soft_close();

就像一个魅力;)

Today I think I've found what I needed: Perl has a module called Expect which is perfect for such situations:

#!/usr/bin/perl

use strict;
use warnings;

use Expect;

my $exp = Expect->spawn('./read.pl');
$exp->send("First Line\n");
$exp->send("Second Line\n");
$exp->send("Third Line\n");
$exp->soft_close();

Works like a charm ;)

も让我眼熟你 2024-09-26 02:25:20

这是我发现的一种次优方法:

use IPC::Run;

my $input = "First Line\n";
my $output;
my $process = IPC::Run::start(['./read.pl'], \$input, \$output);
$process->pump() until $output =~ /Perl read:/;
$input .= "Second Line\n";
$process->finish();
print $output;

它是次优的,因为人们需要知道程序在等待更多输入之前将发出的“提示”。

另一个次优解决方案如下:

use IPC::Run;

my $input = "First Line\n";
my $output;
my $process = IPC::Run::start(['./read.pl'], \$input, my $timer = IPC::Run::timer(1));
$process->pump() until $timer->is_expired();
$timer->start(1);
$input .= "Second Line\n";
$process->finish();

它不需要了解任何提示,但速度很慢,因为它至少等待两秒钟。另外,我不明白为什么需要第二个计时器(否则完成将不会返回)。

有人知道更好的解决方案吗?

Here's a sub-optimal way that I've found:

use IPC::Run;

my $input = "First Line\n";
my $output;
my $process = IPC::Run::start(['./read.pl'], \$input, \$output);
$process->pump() until $output =~ /Perl read:/;
$input .= "Second Line\n";
$process->finish();
print $output;

It's sub-optimal in the sense that one needs to know the "prompt" that the program will emit before waiting for more input.

Another sub-optimal solution is the following:

use IPC::Run;

my $input = "First Line\n";
my $output;
my $process = IPC::Run::start(['./read.pl'], \$input, my $timer = IPC::Run::timer(1));
$process->pump() until $timer->is_expired();
$timer->start(1);
$input .= "Second Line\n";
$process->finish();

It does not require knowledge of any prompt, but is slow because it waits at least two seconds. Also, I don't understand why the second timer is needed (finish won't return otherwise).

Does anybody know better solutions?

何其悲哀 2024-09-26 02:25:20

最后我得到了以下解决方案。虽然还远未达到最佳状态,但它确实有效。即使在像gbacon描述的情况的情况下也是如此。

use Carp qw( confess );
use IPC::Run;
use Scalar::Util;
use Time::HiRes;

# Invokes the given program with the given input and argv, and returns stdout/stderr.
#
# The first argument provided is the input for the program. It is an arrayref
# containing one or more of the following:
# 
# * A scalar is simply passed to the program as stdin
#
# * An arrayref in the form [ "prompt", "input" ] causes the function to wait
#   until the program prints "prompt", then spools "input" to its stdin
#
# * An arrayref in the form [ 0.3, "input" ] waits 0.3 seconds, then spools
#   "input" to the program's stdin
sub capture_with_input {
    my ($program, $inputs, @argv) = @_;
    my ($stdout, $stderr);
    my $stdin = '';

    my $process = IPC::Run::start( [$program, @argv], \$stdin, \$stdout, \$stderr );
    foreach my $input (@$inputs) {
        if (ref($input) eq '') {
            $stdin .= $input;
        }
        elsif (ref($input) eq 'ARRAY') {
            (scalar @$input == 2) or
                confess "Input to capture_with_input must be of the form ['prompt', 'input'] or [timeout, 'input']!";

            my ($prompt_or_timeout, $text) = @$input;
            if (Scalar::Util::looks_like_number($prompt_or_timeout)) {
                my $start_time = [ Time::HiRes::gettimeofday ];
                $process->pump_nb() while (Time::HiRes::tv_interval($start_time) < $prompt_or_timeout);
            }
            else {
                $prompt_or_timeout = quotemeta $prompt_or_timeout;
                $process->pump until $stdout =~ m/$prompt_or_timeout/gc;
            }

            $stdin .= $text;
        }
        else {
            confess "Unknown input type passed to capture_with_input!";
        }
    }
    $process->finish();

    return ($stdout, $stderr);
}

my $input = [
    "First Line\n",
    ["Perl read:", "Second Line\n"],
    [0.5, "Third Line\n"],
];
print "Executing process...\n";
my ($stdout, $stderr) = capture_with_input('./read.pl', $input);
print "done.\n";
print "STDOUT:\n", $stdout;
print "STDERR:\n", $stderr;

使用示例(稍微修改过的 read.pl 来测试 gbacon 的情况):

$ time ./spool_read4.pl
Executing process...
done.
STDOUT:
Perl read: First Line
And here's what head -n1 gets: Second Line
Perl read again: Third Line

STDERR:
./spool_read4.pl  0.54s user 0.02s system 102% cpu 0.547 total

不过,我愿意接受更好的解决方案......

Finally I ended up with the following solution. Still far from optimal, but it works. Even in situations like the one described by gbacon.

use Carp qw( confess );
use IPC::Run;
use Scalar::Util;
use Time::HiRes;

# Invokes the given program with the given input and argv, and returns stdout/stderr.
#
# The first argument provided is the input for the program. It is an arrayref
# containing one or more of the following:
# 
# * A scalar is simply passed to the program as stdin
#
# * An arrayref in the form [ "prompt", "input" ] causes the function to wait
#   until the program prints "prompt", then spools "input" to its stdin
#
# * An arrayref in the form [ 0.3, "input" ] waits 0.3 seconds, then spools
#   "input" to the program's stdin
sub capture_with_input {
    my ($program, $inputs, @argv) = @_;
    my ($stdout, $stderr);
    my $stdin = '';

    my $process = IPC::Run::start( [$program, @argv], \$stdin, \$stdout, \$stderr );
    foreach my $input (@$inputs) {
        if (ref($input) eq '') {
            $stdin .= $input;
        }
        elsif (ref($input) eq 'ARRAY') {
            (scalar @$input == 2) or
                confess "Input to capture_with_input must be of the form ['prompt', 'input'] or [timeout, 'input']!";

            my ($prompt_or_timeout, $text) = @$input;
            if (Scalar::Util::looks_like_number($prompt_or_timeout)) {
                my $start_time = [ Time::HiRes::gettimeofday ];
                $process->pump_nb() while (Time::HiRes::tv_interval($start_time) < $prompt_or_timeout);
            }
            else {
                $prompt_or_timeout = quotemeta $prompt_or_timeout;
                $process->pump until $stdout =~ m/$prompt_or_timeout/gc;
            }

            $stdin .= $text;
        }
        else {
            confess "Unknown input type passed to capture_with_input!";
        }
    }
    $process->finish();

    return ($stdout, $stderr);
}

my $input = [
    "First Line\n",
    ["Perl read:", "Second Line\n"],
    [0.5, "Third Line\n"],
];
print "Executing process...\n";
my ($stdout, $stderr) = capture_with_input('./read.pl', $input);
print "done.\n";
print "STDOUT:\n", $stdout;
print "STDERR:\n", $stderr;

Usage example (with a slightly modified read.pl to test gbacon's case):

$ time ./spool_read4.pl
Executing process...
done.
STDOUT:
Perl read: First Line
And here's what head -n1 gets: Second Line
Perl read again: Third Line

STDERR:
./spool_read4.pl  0.54s user 0.02s system 102% cpu 0.547 total

Still, I'm open to better solutions...

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文