Count the number of capture groups in a qr regex?

Posted on 2024-12-23 13:34:10

I am working on a project that at one point gets a list of files from an FTP server. It then either returns an arrayref of the files or, if an optional regex reference (i.e. a qr) is passed, filters the list with grep. Further, if that qr has a capture group, it treats the captured section as a version number and instead returns a hashref whose keys are the versions and whose values are the file names (the same names that would have been returned as the array had there been no capture group). The code looks like this (simplified slightly):

sub filter_files {
  my ($files, $pattern) = @_;
  my @files = @$files;
  unless ($pattern) {
    return \@files;
  }

  @files = grep { $_ =~ $pattern } @files;
  carp "Could not find any matching files" unless @files;

  my %versions = 
    map { 
      if ($_ =~ $pattern and defined $1) { 
        ( $1 => $_ )
      } else {
        ()
      }
    } 
    @files;

  if (scalar keys %versions) {
    return \%versions;
  } else {
    return \@files;
  }
}

This implementation always tries to build the hash and returns it if that succeeds. My question is: can I detect that the qr has a capture group, and only attempt to build the hash if it does?

Comments (3)

颜漓半夏 2024-12-30 13:34:10

You could use something like:

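use feature 'say';

sub capturing_groups {
    my $re = shift;
    # The empty alternative always matches "", so the match succeeds
    # without $re having to match any real input.
    "" =~ /|$re/;
    # $#+ is the number of groups in the last successfully matched pattern.
    return $#+;
}

say capturing_groups qr/fo(.)b(..)/;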

Output:

2
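
To connect this back to the question, here is a minimal sketch of how the group count could gate the hash-building step in filter_files. It reuses capturing_groups from this answer. The early-return structure is an assumption about what the asker wants, not their actual code. One behavioural difference from the original: when the pattern has a group but no file yields a defined capture, this version returns an empty hashref instead of falling back to the array.

```perl
use strict;
use warnings;
use Carp qw(carp);

# Count capture groups without running the pattern against real input:
# the empty alternative always matches "", and $#+ is then the number of
# groups in the compiled pattern.
sub capturing_groups {
    my $re = shift;
    "" =~ /|$re/;
    return $#+;
}

sub filter_files {
    my ($files, $pattern) = @_;
    return [@$files] unless $pattern;

    my @matched = grep { $_ =~ $pattern } @$files;
    carp "Could not find any matching files" unless @matched;

    # No capture group: return the plain filtered list.
    return \@matched unless capturing_groups($pattern);

    # Capture group present: key each file by its first capture.
    my %versions;
    for my $file (@matched) {
        $versions{$1} = $file if $file =~ $pattern and defined $1;
    }
    return \%versions;
}
```

With this shape the caller can tell from the pattern alone which return type to expect, rather than from whether any capture happened to be defined.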
嗫嚅 2024-12-30 13:34:10

See nparen in Regexp::Parser.

use strictures;
use Carp qw(carp);
use Regexp::Parser qw();
my $parser = Regexp::Parser->new;

sub filter_files {
    my ($files, $pattern) = @_;
    my @files = @$files;
    return \@files unless $pattern;

    carp sprintf('Could not inspect regex "%s": %s (%d)',
        $pattern, $parser->errmsg, $parser->errnum)
        unless $parser->regex($pattern);

    my %versions;
    @files = map {
        if (my ($capture) = $_ =~ $pattern) {
            $parser->nparen
                ? push @{ $versions{$capture} }, $_
                : $_
        } else {
            ()
        }
    } @files;
    carp 'Could not find any matching files' unless @files;

    return (scalar keys %versions)
        ? \%versions
        : \@files;
}

Another possibility, which avoids inspecting the pattern, is to simply rely on the value of $capture. It will be 1 (the Perl true value) after a successful match without a capture group. You can distinguish it from a capture that happens to return 1 because the captured value lacks the IV flag.
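
The flag check described above could be sketched with the core B module. This is only an illustration: has_iv_flag is a hypothetical helper name, and the approach depends on internal scalar flags, which change as soon as a captured string is used in numeric context, so the check has to come before any arithmetic or numeric comparison on the value.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK);

# Hypothetical helper: true if the scalar currently carries the
# integer (IOK) flag.
sub has_iv_flag {
    return (svref_2object(\$_[0])->FLAGS & SVf_IOK()) ? 1 : 0;
}

my ($no_capture) = 'file1' =~ /file/;     # pattern without a group
my ($captured)   = 'file1' =~ /file(1)/;  # captures the string "1"

# Both values print as 1; only the flag tells them apart.
printf "no group: value=%s iv=%d\n", $no_capture, has_iv_flag($no_capture);
printf "group:    value=%s iv=%d\n", $captured,   has_iv_flag($captured);
```

Because this relies on implementation details rather than documented behaviour, counting the groups in the pattern is probably the more robust choice.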

︶ ̄淡然 2024-12-30 13:34:10

You could use YAPE::Regex to parse the regular expression to see if there is a capture present:

use warnings;
use strict;
use YAPE::Regex;

filter_files(qr/foo.*/);
filter_files(qr/(foo).*/);

sub filter_files {
    my ($pattern) = @_;
    print "$pattern ";
    if (has_capture($pattern)) {
        print "yes capture\n";
    }
    else {
        print "no capture\n";
    }
}

sub has_capture {
    my ($pattern) = @_;
    my $cap = 0;
    my $p = YAPE::Regex->new($pattern);
    while ($p->next()) {
        if (scalar @{ $p->{CAPTURE} }) {
            $cap = 1;
            last;
        }
    }
    return $cap;
}

__END__

(?-xism:foo.*) no capture
(?-xism:(foo).*) yes capture