Perl regex: how to know the number of matches

Posted on 2024-09-16 21:09:02

I'm looping through a series of regexes and matching each one against the lines of a file, like this:

for my $regex (@{$regexs_ref}) {
    LINE: for (@rawfile) {
        /@$regex/ && do {
            # do something here
            next LINE;
        };
    }
}

Is there a way for me to know how many matches I've got, so that I can process them accordingly?

If not, maybe this is the wrong approach. Instead of looping through every regex, I could write a separate recipe for each regex, but I don't know which is the better practice.

夏夜暖风 2024-09-23 21:09:02

If you do your matching in list context (i.e., basically assigning to a list), you get all of your matches and groupings in a list. Then you can just use that list in scalar context to get the number of matches.

Or am I misunderstanding the question?

Example:

my @list = /$my_regex/g;
if (@list)
{
  # do stuff
  print "Number of matches: " . scalar @list . "\n";
}
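
One caveat worth illustrating: when the pattern contains capture groups, a list-context `/g` match returns the captured substrings rather than the whole matches, so the list length is no longer the number of matches. The sketch below is only an illustration; `count_matches` is a hypothetical helper name, not something from the question. It counts matches by iterating in scalar context instead.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many times $re matches in $text,
# regardless of how many capture groups the pattern contains.
sub count_matches {
    my ($text, $re) = @_;
    my $count = 0;
    # In scalar context, /g advances one match per evaluation.
    $count++ while $text =~ /$re/g;
    return $count;
}

my $text = "a1 b2 c3";

# List context with two capture groups returns every captured substring,
# so this list has more elements than there are matches.
my @list = $text =~ /([a-z])(\d)/g;
print "list elements: ", scalar(@list), "\n";
print "matches: ", count_matches($text, qr/([a-z])(\d)/), "\n";
```

For patterns without capture groups the two numbers agree, and the list-context approach above is sufficient.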

喜爱皱眉﹌ 2024-09-23 21:09:02

You will need to keep track of that yourself. Here is one way to do it:

#!/usr/bin/perl

use strict;
use warnings;

my @regexes = (
    qr/b/,
    qr/a/,
    qr/foo/,
    qr/quux/,
);

my %matches = map { $_ => 0 } @regexes;
while (my $line = <DATA>) {
    for my $regex (@regexes) {
        next unless $line =~ /$regex/;
        $matches{$regex}++;
    }
}

for my $regex (@regexes) {
    print "$regex matched $matches{$regex} times\n";
}

__DATA__
foo
bar
baz

还如梦归 2024-09-23 21:09:02

In CA::Parser's processing associated with matches for /$CA::Regex::Parser{Kills}{all}/, you're using captures $1 all the way through $10, and most of the rest use fewer. If by the number of matches you mean the number of captures (the highest n for which $n has a value), you could use Perl's special @- array (emphasis added):

@LAST_MATCH_START
@-
$-[0] is the offset of the start of the last successful match. $-[n] is the offset of the start of the substring matched by n-th subpattern, or undef if the subpattern did not match.
Thus after a match against $_, $& coincides with substr $_, $-[0], $+[0] - $-[0]. Similarly, $n coincides with
substr $_, $-[n], $+[n] - $-[n]
if $-[n] is defined, and $+ coincides with
substr $_, $-[$#-], $+[$#-] - $-[$#-]
One can use $#- to find the last matched subgroup in the last successful match. Contrast with $#+, the number of subgroups in the regular expression. Compare with @+.
This array holds the offsets of the beginnings of the last successful submatches in the currently active dynamic scope. $-[0] is the offset into the string of the beginning of the entire match. The n-th element of this array holds the offset of the nth submatch, so $-[1] is the offset where $1 begins, $-[2] the offset where $2 begins, and so on.
After a match against some variable $var:
$` is the same as substr($var, 0, $-[0])
$& is the same as substr($var, $-[0], $+[0] - $-[0])
$' is the same as substr($var, $+[0])
$1 is the same as substr($var, $-[1], $+[1] - $-[1])
$2 is the same as substr($var, $-[2], $+[2] - $-[2])
$3 is the same as substr($var, $-[3], $+[3] - $-[3])

Example usage:

#! /usr/bin/perl

use warnings;
use strict;

my @patterns = (
  qr/(foo(bar(baz)))/,
  qr/(quux)/,
);

chomp(my @rawfile = <DATA>);

foreach my $pattern (@patterns) {
  LINE: for (@rawfile) {
    /$pattern/ && do {
      my $captures = $#-;
      my $s = $captures == 1 ? "" : "s";
      print "$_: got $captures capture$s\n"; 
    };
  }
}

__DATA__
quux quux quux
foobarbaz

Output:

foobarbaz: got 3 captures
quux quux quux: got 1 capture
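
The difference between `$#-` and `$#+` mentioned in the quoted documentation shows up when a trailing group does not take part in the match. A minimal sketch, assuming a made-up pattern with an optional second group; `capture_counts` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($#-, $#+) for a match of $re against
# $text, or an empty list when there is no match.
sub capture_counts {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    # $#- is the index of the last group that actually matched;
    # $#+ is the number of groups in the pattern.
    return ($#-, $#+);
}

my ($last_matched, $total_groups) = capture_counts("foo", qr/(foo)(bar)?/);
print "last matched group: $last_matched, groups in pattern: $total_groups\n";
```

Here the optional `(bar)` group does not participate, so `$#-` is 1 while `$#+` is 2. Note that `$#-` is the index of the highest group that matched, so with alternations such as `(a)|(b)` it can be 2 even though `$1` is undefined.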

失退 2024-09-23 21:09:02

How about the code below:

my $string = "12345yx67hjui89";    
my $count = () = $string =~ /\d/g;
print "$count\n";

It prints 9 here as expected.
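
The same idiom could be applied to the loop in the question to get a total per regex. This is only a sketch: `count_per_pattern` is a hypothetical helper, the sample lines and patterns are made up, and it assumes the patterns contain no capture groups (otherwise the count would be of captured substrings, not matches).

```perl
use strict;
use warnings;

# Hypothetical helper: total occurrences of each pattern across all lines.
# Returns a list of counts in the same order as @$patterns.
sub count_per_pattern {
    my ($lines, $patterns) = @_;
    my @counts;
    for my $re (@$patterns) {
        my $total = 0;
        for my $line (@$lines) {
            # The empty-list assignment forces list context; assigning that
            # to a scalar yields the number of elements the match returned.
            my $n = () = $line =~ /$re/g;
            $total += $n;
        }
        push @counts, $total;
    }
    return @counts;
}

my @lines    = ("quux quux quux", "foobarbaz");
my @patterns = (qr/quux/, qr/ba/, qr/nope/);
my @counts   = count_per_pattern(\@lines, \@patterns);
print "$patterns[$_] matched $counts[$_] times\n" for 0 .. $#patterns;
```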
