How to match multiple items in Perl
my $text ='<span>by <small class="author" itemprop="author">J.K. Rowling</small><span>by <small class="author" itemprop="author">J.K. Rowling</small><span>by <small class="author" itemprop="author">J.K. Rowling</small>';
if ($text =~ m/<span>by <small class="author" itemprop="author">(.+?)<\/small>/ig){
    $author = $1;
    $authorcount{$author} +=1;
}
$authorcounttxt = "authorcount.txt";
open (OUTPUT3, ">$authorcounttxt");
foreach $author (sort { $authorcount{$b} <=> $authorcount{$a} } keys %authorcount){
    print OUTPUT3 ("$author\t\t$authorcount{$author}\n");
}
close (OUTPUT3);
The desired output is:
J.K. Rowling 3
However, I am only getting:
J.K. Rowling 1
Comments (3)
This is an if statement, which means that the inner block will be entered at most once, i.e. only if there is a first match. You likely meant to use a while statement, which enters the inner block once for each match.
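A minimal sketch of that change, using the sample data from the question (written here as one author block repeated three times). The lexical variable declarations and the use of strict and warnings are additions for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Sample input from the question: the same author block repeated three times.
my $text = '<span>by <small class="author" itemprop="author">J.K. Rowling</small>' x 3;

my %authorcount;

# In scalar context, m//g resumes where the previous match ended,
# so the while loop body runs once per match instead of only once.
while ($text =~ m/<span>by <small class="author" itemprop="author">(.+?)<\/small>/ig) {
    $authorcount{$1}++;
}

# Print the authors, most frequent first.
foreach my $author (sort { $authorcount{$b} <=> $authorcount{$a} } keys %authorcount) {
    print "$author\t\t$authorcount{$author}\n";
}
```

With the three-block sample above, the loop body runs three times, so the count for J.K. Rowling ends up at 3 rather than 1.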
Replace your if with a while to iterate through all of the matches of your regex instead of only the first one.
Also, the obligatory note: parsing HTML with regexes is fraught with peril. Consider using a module that can properly parse HTML, Mojo::DOM for example.
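As a rough sketch of the parser-based route, assuming the Mojolicious distribution (which provides Mojo::DOM) is installed: the CSS selector small.author is an assumption based on the markup in the question, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Mojo::DOM;    # ships with Mojolicious; not a core Perl module

# Sample input from the question: the same author block repeated three times.
my $text = '<span>by <small class="author" itemprop="author">J.K. Rowling</small>' x 3;

my %authorcount;

# Parse the fragment, select every <small class="author"> element,
# and count the text content of each one.
my $dom = Mojo::DOM->new($text);
$authorcount{$_}++ for $dom->find('small.author')->map('text')->each;

# Print the authors, most frequent first.
print "$_\t\t$authorcount{$_}\n"
    for sort { $authorcount{$b} <=> $authorcount{$a} } keys %authorcount;
```

The selector does not depend on attribute order or on the surrounding "by" text, which is the main reason to prefer a parser once the HTML gets less regular than this sample.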
As the previous posters have already indicated, the issue is hidden in if ( $text =~ /.../gi ): the condition evaluates to true and the block is executed only once. You want to process every match, which can be done either with a for loop over the matches returned in list context, or with a while loop that steps through the /g matches in scalar context.
NOTE: a regex-based approach will work for your example $text = '...', but if you intend to process complex HTML files, then Mojo::DOM is the right tool for the problem.
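One of the many possible approaches, sketched here with the list-context form of the match. The array name @authors and the output to STDOUT (instead of the authorcount.txt file used in the question) are choices made for this illustration only.

```perl
use strict;
use warnings;

# Sample input from the question: the same author block repeated three times.
my $text = '<span>by <small class="author" itemprop="author">J.K. Rowling</small>' x 3;

# In list context, m//g returns the captured group of every match at once.
my @authors = $text =~ m/<span>by <small class="author" itemprop="author">(.+?)<\/small>/ig;

# Tally how often each author name was captured.
my %authorcount;
$authorcount{$_}++ for @authors;

# Print the authors, most frequent first.
for my $author (sort { $authorcount{$b} <=> $authorcount{$a} } keys %authorcount) {
    print "$author\t\t$authorcount{$author}\n";
}
```

Collecting the matches into an array first keeps the counting step separate from the matching step, which can be convenient if the list of names is needed again later.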