How do I store the captures of a Perl regex in separate variables?

Posted 2024-08-21 15:01:20 · 242 characters · 8 views · 0 comments

I have a regex:

/abc(def)ghi(jkl)mno(pqr)/igs

How would I capture the result of each pair of parentheses into three different variables, one per pair? Right now I am using one array to capture all the results. They come out in sequence, but then I have to parse them, and the list could be huge.

@results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);

Comments (6)

烟雨扶苏 2024-08-28 15:01:20

Your question is a bit ambiguous to me, but I think you want to do something like this:

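my (@first, @second, @third);
# Match in scalar context so that /g advances one match per iteration.
# (Assigning the match to a list here would restart it from the beginning
# every time, so the loop would never end once the pattern matches.)
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    push @first,  $1;
    push @second, $2;
    push @third,  $3;
}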
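
For comparison, if you keep the list-context match from the question, the flat result list can be cut into groups of three afterwards. This is only a sketch: the sample string and the names `$f`, `$s`, `$t` are made up for illustration, and `splice` empties `@results` as it goes.

```perl
use strict;
use warnings;

# Hypothetical sample input with two occurrences of the pattern.
my $string = 'zzzabcdefghijklmnopqrsssszzzABCDEFGHIJKLMNOPQRssss';

# List context + /g: every capture of every match, flattened in order.
my @results = $string =~ /abc(def)ghi(jkl)mno(pqr)/igs;

my (@first, @second, @third);

# splice removes three elements at a time, i.e. one match's worth of captures.
# When @results is empty the assignment yields 0 elements and the loop stops.
while (my ($f, $s, $t) = splice @results, 0, 3) {
    push @first,  $f;
    push @second, $s;
    push @third,  $t;
}

print "first:  @first\n";
print "second: @second\n";
print "third:  @third\n";
```

If the list really is huge, the `while` loop in scalar context is probably the better choice, since it never builds the intermediate list.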
非要怀念 2024-08-28 15:01:20

Starting with 5.10, you can use named capture buffers as well:

#!/usr/bin/perl

use strict; use warnings;

my %data;

my $s = 'abcdefghijklmnopqr';

if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

use Data::Dumper;
print Dumper \%data;

Output:

$VAR1 = {
          'first' => [
                       'def'
                     ],
          'second' => [
                        'jkl'
                      ],
          'third' => [
                       'pqr'
                     ]
        };

For earlier versions, you can use the following, which avoids having to add a line for each capture buffer:

#!/usr/bin/perl

use strict; use warnings;

my $s = 'abcdefghijklmnopqr';

my @arrays = \ my(@first, @second, @third);

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
}

use Data::Dumper;
print Dumper @arrays;

Output:

$VAR1 = [
          'def'
        ];
$VAR2 = [
          'jkl'
        ];
$VAR3 = [
          'pqr'
        ];

But I like keeping related data in a single data structure, so it is best to go back to using a hash. This does require an auxiliary array, however:

my %data;
my @keys = qw( first second third );

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
}

Or, if the names of the variables really are first, second, etc., or if the names of the buffers don't matter and only their order does, you can use:

my @data;
if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
}
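
The snippets above use `if`, so they record a single match. Since the question uses /g, here is a rough sketch of the named-capture version looping over every match. The two-occurrence sample string is an assumption made for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input containing two occurrences of the pattern.
my $s = 'abcdefghijklmnopqr--ABCDEFGHIJKLMNOPQR';

my %data;

# Scalar-context match with /g: each iteration handles one match,
# and %+ holds the named captures of that match only.
while ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/xig) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

print "$_: @{ $data{$_} }\n" for sort keys %data;
```

Each key of `%data` then holds one array per capture group, in match order.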
仅此而已 2024-08-28 15:01:20

An alternate way of doing it would look like ghostdog74's answer, but using an array that stores hash references:

my @results;
while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    my ($key1, $key2, $key3) = ($1, $2, $3);
    push @results, { 
        key1 => $key1,
        key2 => $key2,
        key3 => $key3,
    };
}

# do something with it

foreach my $result (@results) {
    print "$result->{key1}, $result->{key2}, $result->{key3}\n";
}

The main advantage here is that you use a single data structure and get a nicely readable loop.
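
If you are on 5.10 or later, the same array-of-hashes idea might be written with named captures, so the hash keys come from the pattern itself. A sketch only: the group names `key1`, `key2`, `key3` simply mirror the keys used above, and the sample string is invented.

```perl
use strict;
use warnings;

# Hypothetical sample input with two occurrences of the pattern.
my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

my @results;
while ($string =~ /abc(?<key1>def)ghi(?<key2>jkl)mno(?<key3>pqr)/igs) {
    # Copy the named captures of the current match into a fresh hash;
    # %+ itself is overwritten by the next successful match.
    my %captures = %+;
    push @results, \%captures;
}

foreach my $result (@results) {
    print "$result->{key1}, $result->{key2}, $result->{key3}\n";
}
```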

反话 2024-08-28 15:01:20

@OP, when a group in parentheses matches, its text is available in the variables $1, $2, and so on. These are the capture variables (often loosely called backreferences).

$string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
    print "$1 $2 $3\n";
}

output

$ perl perl.pl
def jkl pqr
def jkl pqr
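
If you only need the first match, a match without /g in list context returns its captures directly, so they can go straight into three scalars. A minimal sketch, with a made-up sample string and variable names:

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $string = 'zzzabcdefghijklmnopqrssss';

# Without /g, a list-context match returns the captures of the first match.
# A failed match returns an empty list, so the assignment counts 0 elements
# and the "or die" branch runs.
my ($first, $second, $third) = $string =~ /abc(def)ghi(jkl)mno(pqr)/is
    or die "no match\n";

print "first=$first second=$second third=$third\n";
```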
请帮我爱他 2024-08-28 15:01:20

You could have three different regexes, each focusing on a specific group. Obviously, you would like to just assign different groups to different arrays within the regex, but I think your only option is to split the regex up.

平安喜乐 2024-08-28 15:01:20

You can write a regex containing named capture groups. You do this with the ?<myvar> construct at the beginning of the capture group:

/(?<myvar>[0-9]+)/

You may then refer to those named capture groups using a $+{myvar} form.

Here is a contrived example:

perl -ne 'print "$+{myvar}\n" if /^systemd-(?<myvar>[^:]+)/' /etc/passwd

Given a typical password file, it picks out the systemd users and prints their names without the systemd- prefix. It uses a capture group named myvar. This is just an example thrown together to illustrate the use of named capture group variables.
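
The same idea inside a script might look like the sketch below. The passwd-style lines are invented sample data, used here so the example does not depend on a real /etc/passwd.

```perl
use strict;
use warnings;

# Hypothetical passwd-style lines, used instead of reading /etc/passwd.
my @lines = (
    'root:x:0:0:root:/root:/bin/bash',
    'systemd-network:x:100:102::/run/systemd:/usr/sbin/nologin',
    'systemd-resolve:x:101:103::/run/systemd:/usr/sbin/nologin',
);

my @names;
for my $line (@lines) {
    # $+{myvar} is only meaningful after a successful match,
    # so read it in the same statement that tests the match.
    push @names, $+{myvar} if $line =~ /^systemd-(?<myvar>[^:]+)/;
}

print "$_\n" for @names;
```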
