Perl: iterating through a multiline match
I would like to iterate through a multiline pattern in Perl, but I'm struggling with the syntax.
My input string is:
+++ STAR-WARS 2020-01-01 00:00:00+00:00
S&W #00000000
%%SHOW NAME: Q=Kenobi;%%
RETCODE = 0 Operation success
In-universe information
-----------------------
Species = Human
Gender = Male
television series of num = whatever

(Number of results = 1)

Personal Details
----------------
First Name = Obi-Wan
Last Name = Kenobi
Alias = Padawan
= Jedi Knight
= Jedi General
= Jedi Master
Points to other set of information = whatever

(Number of results = 1)

Other attribute
---------------
Significant other = Satine Kryze
Affiliation = Jedi Order
= Galactic Republic
= Rebel Alliance
Occupation = Jedi

(Number of results = 1)

--- END
My desired resulting hash would be:
$VAR1 = {
          'In-universe information' => {
                                         'Gender' => 'Male',
                                         'Species' => 'Human',
                                         'results' => '1',
                                         'television series of num' => 'whatever'
                                       },
          'Other attribute' => {
                                 'Affiliation' => [
                                                    'Jedi Order',
                                                    'Galactic Republic',
                                                    'Rebel Alliance'
                                                  ],
                                 'Occupation' => 'Jedi',
                                 'Significant other' => 'Satine Kryze',
                                 'results' => '1'
                               },
          'Personal Details' => {
                                  'Alias' => [
                                               'Padawan',
                                               'Jedi Knight',
                                               'Jedi General',
                                               'Jedi Master'
                                             ],
                                  'First Name' => 'Obi-Wan',
                                  'Last Name' => 'Kenobi',
                                  'Points to other set of information' => 'whatever',
                                  'results' => '1'
                                },
          'code' => '0',
          'description' => 'Operation success'
        };
What I have come up with works well for a "single block" (e.g. Personal Details above). However, if the data contains multiple blocks, I can't figure out how to iterate through every matching block (e.g. using while with /g).
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';    # say() is not enabled by default
use Data::Dumper;

local $/;             # slurp mode: read all of DATA in one go
my $output = <DATA>;
my %hash;

($hash{'code'}, $hash{'description'}) = $output =~ /^RETCODE = (\d+)\s+(.*)\n/m;

if ($hash{'code'} eq "0") {
    # A single match, so only one (type, data, results) triple is extracted.
    my ($type, $data, $results) = $output =~ /([^\n]+)\n-+\n(.*)\n\n\(Number of results = (\d+)\)\n\n/sm;
    my $previousKey = "";
    while ($data =~ /(.+)$/mg) {
        my $line = $1;
        $line =~ s/(?:^ +)//g;
        my ($key, $value);
        if ($line =~ /^\s*= /) {
            # No key on this line: append the value to the previous key.
            ($value) = $line =~ /^\s*= (.*)$/;
            $hash{$type}{$previousKey} = [ $hash{$type}{$previousKey} ] unless ref($hash{$type}{$previousKey});
            push (@{$hash{$type}{$previousKey}}, $value);
        } else {
            ($key, $value) = split(/ = /, $line);
            $hash{$type}{$key} = $value;
            $previousKey = $key;
        }
    }
    say STDERR Dumper(\%hash);
}
__DATA__
+++ STAR-WARS 2020-01-01 00:00:00+00:00
S&W #00000000
%%SHOW NAME: Q=Kenobi;%%
RETCODE = 0 Operation success
In-universe information
-----------------------
Species = Human
Gender = Male
television series of num = whatever

(Number of results = 1)

Personal Details
----------------
First Name = Obi-Wan
Last Name = Kenobi
Alias = Padawan
= Jedi Knight
= Jedi General
= Jedi Master
Points to other set of information = whatever

(Number of results = 1)

Other attribute
---------------
Significant other = Satine Kryze
Affiliation = Jedi Order
= Galactic Republic
= Rebel Alliance
Occupation = Jedi

(Number of results = 1)

--- END
A few facts:
- Every "block" always contains a header, followed by a newline and a row of dashes equal in length to the header.
- Every "block" always ends with \n, followed by (Number of results = \d+), followed by \n.
- Each key/value pair always has two spaces before and after the equal sign, i.e. / = /
- When no key exists, assume it's an array and append the value to the previous key, e.g. Alias in my example above.
- The string will always end with --- END followed by a \n.
Comments (1)
According to your description, the section starts with +++ ... and ends with --- END. Based on this information, the input can be divided with a regex into blocks of interest, which are then processed individually in a loop with a parser to build the hash.
NOTE: the parser was slightly modified and put into a subroutine.
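A minimal sketch of that approach follows. It is one possible way to write it, not the original code of this answer. The names parse_block and parse_output and the shortened sample input are made up for illustration, and the sketch assumes a blank line before each "(Number of results = N)" line, as the regex in the question implies. The key point is the while loop with /g in scalar context: each iteration resumes where the previous match ended, so every block is visited once.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: parse the body of one block into a hash reference.
# A line with no key (it starts with "=") is appended to the previous key.
sub parse_block {
    my ($data) = @_;
    my (%block, $previous_key);
    for my $line (split /\n/, $data) {
        next unless $line =~ /\S/;                        # skip blank lines
        if ($line =~ /^\s*=\s+(.*?)\s*$/) {               # "= value"
            my $value = $1;
            next unless defined $previous_key;
            # Promote the existing scalar to an array on first continuation.
            $block{$previous_key} = [ $block{$previous_key} ]
                unless ref $block{$previous_key};
            push @{ $block{$previous_key} }, $value;
        }
        elsif ($line =~ /^\s*(.+?)\s+=\s+(.*?)\s*$/) {    # "key = value"
            my ($key, $value) = ($1, $2);
            $block{$key}  = $value;
            $previous_key = $key;
        }
    }
    return \%block;
}

# Hypothetical helper: parse the whole output into the nested hash.
sub parse_output {
    my ($output) = @_;
    my %hash;

    if ($output =~ /^RETCODE\s+=\s+(\d+)\s+(.*)$/m) {
        @hash{qw(code description)} = ($1, $2);
    }
    return \%hash unless defined $hash{code} && $hash{code} eq '0';

    # Scalar-context match with /g: one block per iteration.
    # header line, dashes, lazily matched body, then the results line.
    while ($output =~ /^([^\n]+)\n-+\n(.*?)\n\s*\(Number of results = (\d+)\)/msg) {
        my ($type, $data, $results) = ($1, $2, $3);
        $hash{$type} = parse_block($data);
        $hash{$type}{results} = $results;
    }
    return \%hash;
}

# Shortened sample input (an assumption, modelled on the question's data).
my $sample = <<'END_SAMPLE';
+++ STAR-WARS 2020-01-01 00:00:00+00:00
RETCODE = 0 Operation success

Personal Details
----------------
First Name = Obi-Wan
Alias = Padawan
= Jedi Knight

(Number of results = 1)

Other attribute
---------------
Occupation = Jedi

(Number of results = 1)

--- END
END_SAMPLE

$Data::Dumper::Sortkeys = 1;
print Dumper(parse_output($sample));
```

The body is matched lazily with (.*?) so that each match stops at the first "(Number of results = N)" line it meets; a greedy (.*) under /s would run on to the last one and merge all blocks into a single match.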