Why won't this regex remove final whitespace from Pod::Usage text?

Posted 2024-11-15 01:12:47

I am working on a module that relies on Pod::Usage to parse the calling script's POD and then send usage, help, and man text to a scalar variable. I needed to remove the final whitespace from that text, so I used a simple regex that I thought would work. And it did ... but intermittently.

Here's a demonstration of the problem. Any insights would be appreciated.

The unexpected behavior (i.e., the failure of the regex to remove final newlines) occurs consistently on my Solaris machine with Perl 5.10.1. Under Windows with Perl 5.12.1, the behavior is erratic (output supplied below).

use strict;
use warnings;

use Pod::Usage qw(pod2usage);
use Test::More;

# Baseline test to show that the regex works.
my $exp                      = "foo\nbar\n...";
my $with_trailing_whitespace = $exp . "   \n\n";
$with_trailing_whitespace    =~ s!\s+\Z!!;
my $ords = get_ords_of_final_chars($with_trailing_whitespace);
is_deeply $ords, [46, 46, 46]; # String ends with 3 periods (not whitespace).

# Run a similar test, using text from Pod::Usage.
for (1 .. 2){
    my $pod = get_pod_text();
    $ords = get_ords_of_final_chars($pod);
    is_deeply $ords, [46, 46, 46];
}

done_testing();

sub get_ords_of_final_chars {
    # Takes a string. Return array ref of the ord() of last 3 characters.
    my $s = shift;
    return [ map ord(substr $s, - $_, 1), 1 .. 3 ];
}

sub get_pod_text {
    # Call pod2usage(), sending output to a scalar.
    open(my $fh, '>', \my $txt) or die $!;
    pod2usage(-verbose => 2, -exitval => 'NOEXIT', -output  => $fh);
    close $fh;   # This doesn't help.

    # Here's the same regex as above.
    # 
    # If I use chomp(), the newlines are consistently removed:
    #     1 while chomp($txt);
    $txt =~ s!\s+\Z!!;
    return $txt; 
}

__END__

=head1 NAME

sample - Some script...

=head1 SYNOPSIS

foo.pl ARGS...

=head1 DESCRIPTION

This program will read the given input file(s) and do something
useful with the contents thereof...

=cut

Output on my Windows box:

$ perl  demo.pl
ok 1
not ok 2
#   Failed test at demo.pl line 18.
#     Structures begin differing at:
#          $got->[0] = '10'
#     $expected->[0] = '46'
not ok 3
#   Failed test at demo.pl line 18.
#     Structures begin differing at:
#          $got->[0] = '10'
#     $expected->[0] = '46'
1..3
# Looks like you failed 2 tests of 3.

$ perl  demo.pl
ok 1
ok 2
ok 3
1..3
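One way to narrow the problem down is to look at the tail of the captured text before and after the substitution, with pod2usage() taken out of the picture. The following is only a diagnostic sketch: capture_via_scalar_fh() and tail_ords() are hypothetical helper names, plain print stands in for pod2usage(), and the sample string is the one from the baseline test. Unlike get_ords_of_final_chars(), tail_ords() returns the code points in string order.

```perl
use strict;
use warnings;

# Hypothetical helper: write $content through an in-memory filehandle,
# the same way get_pod_text() does, and return what lands in the scalar.
sub capture_via_scalar_fh {
    my ($content) = @_;
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";
    print {$fh} $content;
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# Hypothetical helper: ord() of the last $n characters, in string order.
sub tail_ords {
    my ($s, $n) = @_;
    $n = length $s if $n > length $s;
    return [ map { ord } split //, substr($s, -$n) ];
}

my $captured = capture_via_scalar_fh("foo\nbar\n...   \n\n");
my $before   = tail_ords($captured, 5);

# Assumption: \z (end of string only) is used here instead of \Z.
(my $stripped = $captured) =~ s/\s+\z//;
my $after = tail_ords($stripped, 3);

print "before: @$before\n";
print "after:  @$after\n";
```

If the "after" line still shows 10 or 32 when the text comes from pod2usage() rather than print, the problem lies in how the scalar was filled rather than in the regex itself.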

抱着落日 2024-11-22 01:12:47

Well, to quote perlre:

\Z Match only at end of string, or before newline at the end
\z Match only at end of string

So, you should be using $txt =~ s!\s+\z!!; (lower case z).

Although, since \s+ is greedy, I would have expected it to work anyway. Maybe it's a Perl bug.
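The difference between the two anchors quoted from perlre can be seen in a small, self-contained sketch. The sample strings are made up for illustration (the second one is the baseline string from the question); this shows the documented behavior of \Z and \z, not an explanation of the intermittent failure.

```perl
use strict;
use warnings;

# \Z may match just before a single trailing newline;
# \z matches only at the very end of the string.
my $line = "abc\n";
my $upper_matches = ($line =~ /c\Z/) ? 1 : 0;   # \Z tolerates the final "\n"
my $lower_matches = ($line =~ /c\z/) ? 1 : 0;   # "c" is not the last character

# Stripping all trailing whitespace with the stricter anchor.
my $text = "foo\nbar\n...   \n\n";
(my $trimmed = $text) =~ s/\s+\z//;

print "upper=$upper_matches lower=$lower_matches\n";
print "trimmed ends with: ", substr($trimmed, -3), "\n";
```

Because \s+ is greedy, s/\s+\Z// would also consume every trailing whitespace character in this plain string, which is why the failure seen with the Pod::Usage text is surprising.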

夜未央樱花落 2024-11-22 01:12:47

While the other posters are correct about \z, \Z, and $, FWIW I don't get any failures on Win32:

$ perl -d:Modlist demo.pl
ok 1
ok 2
ok 3
1..3
Carp                   1.17
Config
Encode                 2.43
Encode::Alias          2.14
Encode::Config         2.05
Encode::Encoding       2.05
Exporter             5.64_01
Exporter::Heavy      5.64_01
File::Spec             3.33
File::Spec::Unix       3.33
File::Spec::Win32      3.33
PerlIO                 1.06
PerlIO::scalar         0.08
Pod::Escapes           1.04
Pod::InputObjects      1.31
Pod::Parser            1.37
Pod::Select            1.36
Pod::Simple            3.16
Pod::Simple::BlackBox   3.16
Pod::Simple::LinkSection   3.16
Pod::Text              3.15
Pod::Usage             1.36
Test::Builder          0.98
Test::Builder::Module   0.98
Test::More             0.98
XSLoader               0.15
base                   2.15
bytes                  1.04
integer                1.00
overload               1.10
vars                   1.01
warnings               1.09
warnings::register     1.01
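The module list above comes from Devel::Modlist, loaded through -d:Modlist. Where that module is not installed, a rough equivalent for comparing environments can be sketched from %INC. loaded_module_versions() is a hypothetical helper name, and the output will not match Devel::Modlist's exactly; it only reports modules loaded at the point where it is called.

```perl
use strict;
use warnings;

# Hypothetical helper: map each loaded .pm file in %INC to a module name
# and its $VERSION (empty string when the module defines none).
sub loaded_module_versions {
    my %versions;
    for my $path (sort keys %INC) {
        (my $module = $path) =~ s{/}{::}g;     # File/Spec.pm -> File::Spec.pm
        $module =~ s{\.pm\z}{} or next;        # skip entries that are not .pm files
        my $version = eval { $module->VERSION };
        $versions{$module} = defined $version ? $version : '';
    }
    return \%versions;
}

my $versions = loaded_module_versions();
printf "%-24s %s\n", $_, $versions->{$_} for sort keys %$versions;
```

Placing the call at the end of demo.pl, after done_testing(), would list whatever Pod::Usage and Test::More pulled in on each machine.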
