Counting word frequencies and then sorting

Posted 2024-12-14 13:16:05

I'm writing a Perl script that should process a text, build a dictionary of word frequencies, and then sort the dictionary. The text is an extract from "Golden Bug" by Edgar Poe, and the goal is to calculate the frequencies of all the words. But I am doing something wrong, because I get no output. What am I doing wrong? Thanks.

open(TEXT, "goldenbug.txt") or die("File not found");
while(<TEXT>)
{
    chomp;
    $_=lc;
    s/--/ /g;
    s/ +/ /g;
    s/[.,:;?"()]//g;

    @word=split(/ /);
    foreach $word (@words)
    {
        if( /(\w+)'\W/ )
        {
            if($1 eq 'bug')
            {
                $word=~s/'//g;
            }
        }
        if( /\W'(\w+)/)
        {
            if(($1 ne 'change') and ($1 ne 'em') and ($1 ne 'prentices'))
            {
                $word=~s/'//g;
            }
        }

        $dictionary{$word}+=1;
    }
}

foreach $word(sort byDescendingValues keys %dictionary)
{
    print "$word, $dictionary{$word}\n";
}

sub byDescendingValues
{
    $value=$dictionaty{$b} <=> $dictionary{$a};
    if ($value==0)
    {
        return $a cmp $b
    }
    else
    {
        return $value;
    }
}

迷爱 2024-12-21 13:16:05

You have in your code:

@word=split(/ /);
foreach $word (@words)
    {

You named the array @word in the split, but the foreach loop iterates over @words. Since @words is never filled, the loop body never runs, which is why you get no output.

@word=split(/ /);

should be

@words=split(/ /);
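
The mismatch produces no error because, without strict, Perl treats @words as a separate, empty package array. A minimal sketch of that behaviour follows; the helper name count_loop_iterations and the sample sentence are illustrative assumptions, not part of the original script.

```perl
no strict 'vars';    # mirror the original script, which does not enable strict

# Returns (number of elements stored in @word, number of loop iterations).
sub count_loop_iterations {
    my ($line) = @_;
    @word = split(/ /, $line);      # fills @word ...
    my $iterations = 0;
    foreach $word (@words) {        # ... but iterates over the empty @words
        $iterations++;
    }
    return (scalar(@word), $iterations);
}

my ($filled, $looped) = count_loop_iterations("the gold bug");
print "elements in \@word: $filled, loop iterations: $looped\n";
# prints: elements in @word: 3, loop iterations: 0
```

Because the loop never executes, %dictionary stays empty and the final print loop has nothing to show.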

Another typo in the byDescendingValues routine:

$value=$dictionaty{$b} <=> $dictionary{$a};
                ^^

As suggested in the other answer, you really should add

use strict;
use warnings;

With these enabled, you could have caught these typos easily. Without them, you will waste a lot of time.
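
As a rough illustration of how strict reports such a typo, the sketch below compiles a small code string with a string eval and prints the resulting error. The helper name compile_error and the sample hash contents are assumptions made for this example; in a real script the error would simply abort compilation.

```perl
use strict;
use warnings;

# Compile and run a code string; return the error message, or '' on success.
# The string eval inherits "use strict" from this enclosing scope.
sub compile_error {
    my ($code) = @_;
    my $ok = eval "$code; 1";
    return $ok ? '' : $@;
}

# Same misspelling as in the question: %dictionaty instead of %dictionary.
my $typo = q{
    my %dictionary = (bug => 2);
    my $value = $dictionaty{bug} <=> $dictionary{bug};
};

print compile_error($typo);
```

The message names the undeclared variable (Global symbol "%dictionaty" requires explicit package name ...), which points straight at the misspelling.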

死开点丶别碍眼 2024-12-21 13:16:05

As well as confusing @word and @words, you are also using $dictionaty instead of $dictionary. It is wise to

use strict;
use warnings;

at the start of your program and declare all of your variables using my. That way, trivial bugs like these are caught by Perl itself.
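
Putting both answers together, one possible corrected version might look like the sketch below. It is not the asker's final script: the sub names count_words and by_descending_count are hypothetical, the input is passed as in-memory lines instead of being read from goldenbug.txt, and the apostrophe special-casing from the question is left out for brevity.

```perl
use strict;
use warnings;

# Count how often each word occurs in the given lines.
sub count_words {
    my @lines = @_;
    my %dictionary;
    for my $line (@lines) {
        my $text = lc $line;
        $text =~ s/--/ /g;             # treat double hyphens as word breaks
        $text =~ s/[.,:;?"()]//g;      # strip the same punctuation as the question
        $dictionary{$_}++ for split ' ', $text;
    }
    return \%dictionary;
}

# Words ordered by descending count; ties are broken alphabetically.
sub by_descending_count {
    my ($counts) = @_;
    return sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b } keys %$counts;
}

my $counts = count_words("The bug--the gold bug.", "A bug.");
print "$_, $counts->{$_}\n" for by_descending_count($counts);
# prints:
# bug, 3
# the, 2
# a, 1
# gold, 1
```

Declaring every variable with my means a misspelled name such as @word or %dictionaty stops compilation instead of silently creating a new, empty variable.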

About the author: 愿与i
