Problem comparing 2 strings in Perl

Posted on 2025-01-08 02:21:08

I want to count the number of common lines that exist between 2 files using Perl.

I have one base file, and I want to check whether all of its lines (separated by a newline \n) exist in fileA. I put all the lines from the base file into a %base_config hash and the lines from fileA into a %config hash. For every key in %config, I want to check whether it can also be found among the keys of %base_config.
To make comparing the keys more efficient, I sorted the keys in %base_config and put them into @sorted_base_config.

However, for some files that have exactly the same lines but in a different order, I am not able to get the correct count. For example, the base file contains:

hello
hi
tired
sleepy

whereas fileA contains:

hi
tired
sleepy
hello

I am able to read in the values from the files and place them into their respective hashes and arrays. Here is the part of the code that went wrong:

$count=0;
while(($key,$value)=each(%config))
{
    foreach (@sorted_base_config) 
    {
        print "config: $config{$key}\n";
        print "\$_: $_\n";
        if($config{$key} eq $_)
        {
            $count++;
        }
    }
}

Can someone please tell me if I have made a mistake? The count is supposed to be 4, but it keeps printing 2.

EDIT:
Here's my original code that didn't work. It looks quite different because I tried to use different methods to fix the problem. However, I am still stuck at the same problem.

#open base config file and load them into the base_config hash
open BASE_CONFIG_FILE, "< script/base.txt" or die;
my %base_config;
while (my $line=<BASE_CONFIG_FILE>) {
   (my $word1,my $word2) = split /\n/, $line;
   $base_config{$word1} = $word1;
}
#sort BASE_CONFIG_FILE
@sorted_base_config = sort keys %base_config;

#open config file and load them into the config hash
open CONFIG_FILE, "< script/hello.txt" or die;
my %config;
while (my $line=<CONFIG_FILE>) {
   (my $word1,my $word2) = split /\n/, $line;
   $config{$word1} = $word1;
}
#sort CONFIG_FILE
@sorted_config = sort keys %config;

%common={};
$count=0;
while(($key,$value)=each(%config))
{
    $num=keys(%base_config);
    $num--;#to get the correct index
    #print "$num\n";
    while($num>=0)
    {
        #check if all the strings in BASE_CONFIG_FILE can be found in CONFIG_FILE
        $common{$value}=$value if exists $base_config{$key};
        #print "yes!\n" if exists $base_config{$key};
        $num--;
    }
}
print "count: $count\n";

while(($key,$value)=each(%common))
{
    print "key: ".$key."\n";
    print "value: ".$value."\n";
}
$num=keys(%common)-1;
print "common lines: ".$num;

Previously, I pushed the common keys that exist in both the base_config file and fileA into %common. In future, I want to print the common keys to a txt file, and whatever is found in fileA but not in the base_config file will be output to another txt file. However, I am already stuck at the initial phase of finding the common keys.

I am using "\n" to split into keys for storing so I can't use chomp function that will remove "\n".

EDIT 2:
I just realised what's wrong with my code. At the end of my txt files, I need to add "\n" to make it work. Thanks for all your help! :D


夏了南城 2025-01-15 02:21:08

I think your attempt at efficiency is actually slowing things down.

my %listA;

# Read first file (name in $NameA)
{
    open my $fileA, '<', "$NameA" or die $!;
    while (<$fileA>)
    {
        chomp;
        $listA{$_}++;
    }
}

# Read second file (name in $NameB)
{
    open my $fileB, '<', "$NameB" or die $!;
    while (<$fileB>)
    {
        chomp;
        if ($listA{$_})
        {
            print "Line appears in $NameB once and $listA{$_} times in $NameA: $_\n";
        }
    }
}

If you want to read the second file into a hash too, that also works. With the fragment below, any line that appears in both files is listed. Note that even though I present the keys in sorted order, I'm using the hash lookup, because that will be quicker than shuffling through two sorted arrays. You'd be hard-pressed to measure any difference on 4-line files, of course. And with large files, the chances are that the I/O time spent reading the files and printing the results will dominate the lookup time.

my %listB;

# Read second file (name in $NameB)
{
    open my $fileB, '<', "$NameB" or die $!;
    while (<$fileB>)
    {
        chomp;
        $listB{$_}++;
    }
}

foreach my $key (sort keys %listA)
{
    if ($listB{$key})
    {
        print "$NameA: $listA{$key}; $NameB: $listB{$key}; $key\n";
    }
}

Reorganize the output as you wish.

Untested code! Code now tested - see below.


Converted to test code

Data: FileA

hello
hi
tired
sleepy

Data: FileB

hi
tired
sleepy
hello

Program: ppp.pl

#!/usr/bin/env perl
use strict;
use warnings;

my $NameA = "fileA";
my $NameB = "fileB";

my %listA;

# Read first file (name in $NameA)
{
    open my $fileA, '<', "$NameA" or die "Failed to open $NameA: $!\n";
    while (<$fileA>)
    {
        chomp;
        $listA{$_}++;
    }
}

# Read second file (name in $NameB)
{
    open my $fileB, '<', "$NameB" or die "Failed to open $NameB: $!\n";
    while (<$fileB>)
    {
        chomp;
        if ($listA{$_})
        {
            print "Line appears in $NameB once and $listA{$_} times in $NameA: $_\n";
        }
    }
}

Output

$ perl ppp.pl
Line appears in fileB once and 1 times in fileA: hi
Line appears in fileB once and 1 times in fileA: tired
Line appears in fileB once and 1 times in fileA: sleepy
Line appears in fileB once and 1 times in fileA: hello
$

Note that this is listing things in the order of fileB, as it should given that the loop reads through fileB and checks each line in turn.

Code: qqq.pl

This is the second fragment turned into a complete working program.

#!/usr/bin/env perl
use strict;
use warnings;

my $NameA = "fileA";
my $NameB = "fileB";

my %listA;

# Read first file (name in $NameA)
{
    open my $fileA, '<', "$NameA" or die "Failed to open $NameA: $!\n";
    while (<$fileA>)
    {
        chomp;
        $listA{$_}++;
    }
}

my %listB;

# Read second file (name in $NameB)
{
    open my $fileB, '<', "$NameB" or die "Failed to open $NameB: $!\n";
    while (<$fileB>)
    {
        chomp;
        $listB{$_}++;
    }
}

foreach my $key (sort keys %listA)
{
    if ($listB{$key})
    {
        print "$NameA: $listA{$key}; $NameB: $listB{$key}; $key\n";
    }
}

Output:

$ perl qqq.pl
fileA: 1; fileB: 1; hello
fileA: 1; fileB: 1; hi
fileA: 1; fileB: 1; sleepy
fileA: 1; fileB: 1; tired
$

Note that the keys are listed in sorted order, which is not the order in either fileA or fileB.

Minor miracles occasionally happen! Apart from adding the 5 lines of preamble (shebang, 2 x use, 2 x my), the code for both program fragments worked correctly, by my reckoning, the first time. (Oh, and I improved the error messages on failing to open a file, so they at least identify which file could not be opened. And ikegami edited my code (thanks!) to add the chomp calls consistently, and the newlines to the print operations, which now need the explicit newline.)

I would not claim this is great Perl code; it certainly won't win a (code) golfing contest. It does seem to work, though.
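
The question also asks for a count of the common lines and, later, for a separate list of the lines found only in fileA. A minimal sketch of that, building on the same hash-lookup idea, could look like the following. The helper name partition_lines and the extra sample line "extra" are assumptions made for illustration; they are not in the original post.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs of already-chomped lines and
# returns (\@common, \@only_in_other), keeping the order of the second list.
sub partition_lines {
    my ($base_lines, $other_lines) = @_;
    my %in_base = map { $_ => 1 } @$base_lines;
    my (@common, @only_in_other, %seen);
    for my $line (@$other_lines) {
        next if $seen{$line}++;    # count each distinct line only once
        if ($in_base{$line}) {
            push @common, $line;
        }
        else {
            push @only_in_other, $line;
        }
    }
    return (\@common, \@only_in_other);
}

# Sample data from the question, plus one made-up line ("extra").
my @base  = qw(hello hi tired sleepy);
my @fileA = qw(hi tired sleepy hello extra);

my ($common, $only_in_fileA) = partition_lines(\@base, \@fileA);
print "common lines: ", scalar(@$common), "\n";
print "only in fileA: @$only_in_fileA\n";
```

Each of the two returned lists could then be printed to its own output file.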


Analysis of Code in Question

open BASE_CONFIG_FILE, "< script/base.txt" or die;
my %base_config;
while (my $line=<BASE_CONFIG_FILE>) {
   (my $word1,my $word2) = split /\n/, $line;
   $base_config{$word1} = $word1;
}

The split is odd...you have a line that ends with a newline, and you split at the newline, so $word2 is empty, and $word1 contains the rest of the line. You then store the value $word1 (not $word2 as I assumed at first glance) into the base configuration. So the key and the value are the same for each entry. Unusual. Not actually wrong, but ... unusual. The second loop is essentially the same (we should both be shot for not using a single sub to do the reading for us).
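
As a rough sketch of the "single sub to do the reading" idea, something like the following would serve both files. The name read_lines is made up for illustration. It uses chomp rather than split, so a final line with no trailing newline is stored the same way as every other line. The demo reads from an in-memory string (a scalar reference passed to open) only so that it runs without any files on disk.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: read a file and return a hash ref of line => count.
sub read_lines {
    my ($name) = @_;
    open my $fh, '<', $name or die "Failed to open $name: $!\n";
    my %lines;
    while (my $line = <$fh>) {
        chomp $line;        # removes the trailing newline, if there is one
        $lines{$line}++;
    }
    close $fh;
    return \%lines;
}

# Demo data: note there is no newline after the last line.
my $data  = "hello\nhi\ntired\nsleepy";
my $lines = read_lines(\$data);
print scalar(keys %$lines), " distinct lines\n";
```

With real files, the call would simply be read_lines("fileA") and read_lines("fileB").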

You can't be using use strict; and use warnings; - note that practically the first thing I did with my code was add them. I've only been programming in Perl for about 20 years, and I know I don't know enough to risk running code without them. Your sorted arrays, %common, $count, $num, $key, $value are not my'd. It probably doesn't do much harm this time, but...it is a bad sign. Always, but always, use use strict; use warnings; until you know enough about Perl not to need to ask questions about it (and don't expect that to be any time soon).

When I run it, at the point where there is:

my %common={};  # line 32 - I added diagnostic printing
my $count=0;

Perl tells me:

Reference found where even-sized list expected at rrr.pl line 32, <CONFIG_FILE> line 4.

Oops - those {} should be an empty list (). See why you run with warnings enabled!
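
To see the difference in isolation, here is a small sketch (the variable names are made up). Assigning {} to a hash stores one key, the stringified hash reference, with an undefined value, whereas () leaves the hash empty. The warning is switched off locally only so that the demonstration runs quietly.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %empty = ();     # an empty list: the hash has no keys

my %oops;
{
    # Silence "Reference found where even-sized list expected" for the demo.
    no warnings 'misc';
    %oops = {};     # a one-element list: a hash reference, used as a key
}

print "empty has ", scalar(keys %empty), " keys\n";
print "oops has ",  scalar(keys %oops),  " keys\n";

my ($key) = keys %oops;
print "oops key looks like: $key\n";    # something like HASH(0x...)
```

That stray entry is why the original code subtracted 1 from the key count of %common.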

And then, at

 50 while(my($key,$value)=each(%common))
 51 {
 52     print "key: ".$key."\n";
 53     print "value: ".$value."\n";
 54 }

Perl tells me:

key: HASH(0x100827720)
Use of uninitialized value $value in concatenation (.) or string at rrr.pl line 53, <CONFIG_FILE> line 4.

That's the first entry in %common throwing things for a loop.


Fixed code: rrr.pl

#!/usr/bin/env perl
use strict;
use warnings;

#open base config file and load them into the base_config hash
open BASE_CONFIG_FILE, "< fileA" or die;
my %base_config;
while (my $line=<BASE_CONFIG_FILE>) {
   (my $word1,my $word2) = split /\n/, $line;
   $base_config{$word1} = $word1;
   print "w1 = <<$word1>>; w2 = <<$word2>>\n";
}

{ print "First file:\n"; foreach my $key (sort keys %base_config) { print "$key => $base_config{$key}\n"; } }

#sort BASE_CONFIG_FILE
my @sorted_base_config = sort keys %base_config;

#open config file and load them into the config hash
open CONFIG_FILE, "< fileB" or die;
my %config;
while (my $line=<CONFIG_FILE>) {
   (my $word1,my $word2) = split /\n/, $line;
   $config{$word1} = $word1;
   print "w1 = <<$word1>>; w2 = <<$word2>>\n";
}
#sort CONFIG_FILE
my @sorted_config = sort keys %config;

{ print "Second file:\n"; foreach my $key (sort keys %config) { print "$key => $config{$key}\n"; } }

my %common=();
my $count=0;
while(my($key,$value)=each(%config))
{
    print "Loop: $key = $value\n";
    my $num=keys(%base_config);
    $num--;#to get the correct index
    #print "$num\n";
    while($num>=0)
    {
        #check if all the strings in BASE_CONFIG_FILE can be found in CONFIG_FILE
        $common{$value}=$value if exists $base_config{$key};
        #print "yes!\n" if exists $base_config{$key};
        $num--;
    }
}
print "count: $count\n";

while(my($key,$value)=each(%common))
{
    print "key: $key -- value: $value\n";
}
my $num=keys(%common);
print "common lines: $num\n";

Output:

$ perl rrr.pl
w1 = <<hello>>; w2 = <<>>
w1 = <<hi>>; w2 = <<>>
w1 = <<tired>>; w2 = <<>>
w1 = <<sleepy>>; w2 = <<>>
First file:
hello => hello
hi => hi
sleepy => sleepy
tired => tired
w1 = <<hi>>; w2 = <<>>
w1 = <<tired>>; w2 = <<>>
w1 = <<sleepy>>; w2 = <<>>
w1 = <<hello>>; w2 = <<>>
Second file:
hello => hello
hi => hi
sleepy => sleepy
tired => tired
Loop: hi = hi
Loop: hello = hello
Loop: tired = tired
Loop: sleepy = sleepy
count: 0
key: hi -- value: hi
key: tired -- value: tired
key: hello -- value: hello
key: sleepy -- value: sleepy
common lines: 4
$
妄断弥空 2025-01-15 02:21:08

Maybe it's not the approach you are looking for, but what if you went about it more like this:

#!/usr/bin/perl
use Data::Dumper;
use warnings;
use strict;

my @sorted_base_config = qw(hello hi tired sleepy);
my @file_a = qw(hi tired sleepy hello);
my @found_in_both = ();

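foreach my $item (@sorted_base_config) {
  # Compare with eq and a named loop variable: inside grep, $_ is each
  # element of @file_a, so /$_/ would match every element against itself.
  if (grep { $_ eq $item } @file_a) {
    push(@found_in_both, $item);
  }
}

print "These items were found in file_a:\n";
print Dumper(\@found_in_both);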

Basically, instead of doing the key/value hash thing... why not try using two arrays and using foreach for the base file array. As you go through each line of @sorted_base_config you check to see if the string can be found in @file_a.

It's up to you as to how you want to get the files into the @sorted_base_config and @file_a arrays (and how to deal with newlines or line breaks.) But with this way, at least, it seems to get a more accurate check of what words match.
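
One thing to watch when checking whole lines with a pattern match: a regex match is a substring match, so a short word can "match" a longer line that merely contains it. The small sketch below, with made-up sample data, contrasts that with a string comparison using eq.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Made-up sample data: "hi" is not a line here, but "this" contains it.
my @file_a = qw(this tired);
my $word   = 'hi';

# grep in scalar context returns the number of matching elements.
my $regex_hits = grep { /$word/ } @file_a;        # substring match
my $exact_hits = grep { $_ eq $word } @file_a;    # whole-string comparison

print "regex: $regex_hits, exact: $exact_hits\n";
```

For comparing complete lines, eq (or a hash lookup) is usually the safer choice.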

永言不败 2025-01-15 02:21:08

Without seeing how you defined and populated the %config and @sorted_base_config variables I'm not sure what is causing your code to fail. If you provide the output of running the code you have above it would be more obvious.

Rather than providing a whole new approach as in the other answers, I tried "fixing" yours, but mine works with no issues. That would imply that the error is actually in how you populated the variables, rather than in how you are checking.

For simplicity in matching your code, I assigned both the key and the value to be what was read from the file.

This code:

#!C:\Perl\bin\perl
use strict;
use warnings;

my $f1 = $ARGV[0];
my $f2 = $ARGV[1];
my %config_base;
my %config;
my $line;
print "F1 = $f1\nF2 = $f2\n";

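# Use "or", not "||": "||" binds to $f1, so die would never be called.
open F1, '<', $f1 or die "Failed to open $f1: $!\n";
while ($line = <F1>) {
    chomp $line;
    print "adding $line\n";
    $config_base{$line}=$line;
}
close F1;
open F2, '<', $f2 or die "Failed to open $f2: $!\n";
while ($line = <F2>) {
    chomp $line;
    print "adding $line\n";
    $config{$line}=$line;
}
close F2;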
my $count=0;
my $key; my $value;
my @sorted_base_config = sort keys %config_base;
while(($key,$value)=each(%config))
{
    foreach (@sorted_base_config) 
    {
        print "config: $config{$key}\n";
                print "\$_: $_\n";
        if($config{$key} eq $_)
        {
            $count++;
        }
    }
}
print "Count = $count\n";

Results in the output:

F1 = config_base.txt
F2 = config.txt
adding hello
adding hi
adding tired
adding sleepy
adding hi
adding tired
adding sleepy
adding hello
config: hi
$_: hello
config: hi
$_: hi
config: hi
$_: sleepy
config: hi
$_: tired
config: hello
$_: hello
config: hello
$_: hi
config: hello
$_: sleepy
config: hello
$_: tired
config: tired
$_: hello
config: tired
$_: hi
config: tired
$_: sleepy
config: tired
$_: tired
config: sleepy
$_: hello
config: sleepy
$_: hi
config: sleepy
$_: sleepy
config: sleepy
$_: tired
Count = 4

However, Johnathan's answer is a better approach than what you started with. At the very least, using exists to compare the keys of the 2 input hashes is far better than a nested loop against an array of keys. The loop defeats the efficiency of using a hash to begin with.

In that case, you would have something like:

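foreach my $key (keys %config_base)
{
    print "base key: $key\n";
    # Only read $config{$key} once we know the key exists,
    # otherwise a missing key triggers an uninitialized-value warning.
    if(exists $config{$key})
    {
        print "config: $config{$key}\n";
        $count++;
    }
}
print "Count = $count\n";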
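
If only the count is needed, the same exists check can be written more compactly with grep. This is just a sketch; the two hashes are filled from the question's sample data so that it runs on its own.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sample data from the question, with key and value the same as in the post.
my %config_base = map { $_ => $_ } qw(hello hi tired sleepy);
my %config      = map { $_ => $_ } qw(hi tired sleepy hello);

# grep in scalar context returns the number of elements that pass the test.
my $count = grep { exists $config{$_} } keys %config_base;
print "Count = $count\n";
```

Each key costs one hash lookup, so the work grows with the number of lines rather than with the product of the two file sizes.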