How to match multiple items in Perl

Posted on 2025-02-06 15:06:36

my $text = '<span>by <small class="author" itemprop="author">J.K. Rowling</small><span>by <small class="author" itemprop="author">J.K. Rowling</small><span>by <small class="author" itemprop="author">J.K. Rowling</small>';


if ($text =~ m/<span>by <small class="author" itemprop="author">(.+?)<\/small>/ig){
    $author = $1;
    $authorcount{$author} +=1;
}

$authorcounttxt = "authorcount.txt";
open (OUTPUT3, ">$authorcounttxt");
foreach $author (sort { $authorcount{$b} <=> $authorcount{$a} } keys %authorcount){
    print OUTPUT3 ("$author\t\t$authorcount{$author}\n");
}
close (OUTPUT3);

The desired output is:

J.K. Rowling 3

However, I am only getting:

J.K. Rowling 1

Comments (3)

碍人泪离人颜 2025-02-13 15:06:36
if ($text =~ m/.../ig){
     $author = $1;
     $authorcount{$author} +=1;
}

This is an if statement, which means that the inner block will be entered at most once, i.e. only if there is a first match. You likely meant to use a while statement, which enters the inner block once for each match:

while ($text =~ m/.../ig){
     $author = $1;
     $authorcount{$author} +=1;
}
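
The difference between the two statements can be seen in a small self-contained sketch. The sample string, the shortened pattern, and the names %if_count and %while_count are made up for illustration and are not from the question. It relies on the documented behaviour that a scalar-context match with /g resumes from where the previous match ended:

```perl
use strict;
use warnings;

# Hypothetical sample input: two distinct authors, three entries in total.
my $text = '<small class="author">A. Author</small>'
         . '<small class="author">B. Writer</small>'
         . '<small class="author">A. Author</small>';

my $re = qr{<small class="author">(.+?)</small>}i;

# "if" evaluates the match once, so at most one author is counted.
my %if_count;
if ($text =~ /$re/g) {
    $if_count{$1}++;
}

# A successful scalar-context /g match leaves pos($text) set,
# so reset it before scanning the string again from the start.
pos($text) = undef;

# "while" repeats the match, resuming after the previous match each time.
my %while_count;
while ($text =~ /$re/g) {
    $while_count{$1}++;
}

print "if:    $_ => $if_count{$_}\n"    for sort keys %if_count;
print "while: $_ => $while_count{$_}\n" for sort keys %while_count;
```

With this input the if version records a single entry for "A. Author", while the while version records two for "A. Author" and one for "B. Writer".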

阳光的暖冬 2025-02-13 15:06:36

Replace your if with a while to iterate through all of the matches of your regex instead of only the first one:

while ($text =~ m/<span>by <small class="author" itemprop="author">(.+?)<\/small>/ig){
  $author = $1;
  $authorcount{$author} += 1;
}

Also, the obligatory note: parsing HTML with regexes is fraught with peril. Consider using a module that can properly parse HTML, such as Mojo::DOM.

凉城凉梦凉人心 2025-02-13 15:06:36

As already indicated by the previous posters, the issue is hidden in if ( $text =~ /.../gi ): the condition is evaluated once, so the block is executed only once.

You want to process every match, which can be achieved with a for loop (the match is evaluated in list context and returns all captures) or with a while loop (the match is repeated in scalar context).

The following code snippet demonstrates one of many approaches to the solution.

use strict;
use warnings;
use feature 'say';

my(%authors, $fname, $text, $re);

$fname = 'authorcount.txt';
$text  = '<span>by <small class="author" itemprop="author">J.K. Rowling</small><span>by <small class="author" itemprop="author">J.K. Rowling</small><span>by <small class="author" itemprop="author">J.K. Rowling</small>';
$re    = qr/<span>by <small class="author" itemprop="author">(.*?)<\/small>/i;

# In list context a /g match returns every capture; count each one via $_
$authors{$_}++ for $text =~ /$re/g;

open my $fh, ">", $fname
    or die "Can't open $fname: $!";
    
say $fh "$_ $authors{$_}" for sort keys %authors;

close $fh;

NOTE: this code works for your example $text = '...'. If you intend to process complex HTML files, then Mojo::DOM is the right tool for the job.
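
As a rough sketch of how the list-context approach behaves with more than one author, the variant below builds a made-up input (the names in @names are illustrative, not from the question) and orders the result by descending count, as the question's sort does. It assumes the same markup as the example and is not meant as a general HTML parser:

```perl
use strict;
use warnings;

# Hypothetical sample data: two authors with different frequencies.
my @names = ('J.K. Rowling', 'Jane Austen', 'J.K. Rowling', 'J.K. Rowling', 'Jane Austen');

my $text = '';
$text .= qq{<span>by <small class="author" itemprop="author">$_</small>} for @names;

my $re = qr{<span>by <small class="author" itemprop="author">(.+?)</small>}i;

# In list context, a /g match with one capture group returns all captures;
# the for modifier aliases each captured name to $_ in turn.
my %count;
$count{$_}++ for $text =~ /$re/g;

# Highest count first; ties are broken alphabetically for a stable order.
my @by_count = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;

print "$_\t$count{$_}\n" for @by_count;
```

For this input the script lists "J.K. Rowling" with a count of 3, followed by "Jane Austen" with a count of 2.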
