Perl: pattern matching for pairs of string comparisons

Posted on 2024-10-24 17:24:09

I am having trouble counting the majority selected result for each pair of strings. My code seems to count only whether a user chose sysA, sysB, or both, without taking the pair of strings into account. I also have trouble making multiple comparisons and handling the 7 users for each pair.

while ( $file = <INFILE> ) {
    @field = parse_csv($file);
    chomp(@field);
    @query = $field[1];

    for ( $i = 0; $i < @query; ++$i ) {
        if ( ( $field[2] eq $method ) || ( $field[3] eq $method ) ) {
            # the user selected the first system
            if ( $field[4] eq $field[2] ) {
                print "$query[$i]: $field[2], $field[3], $field[4]\n";
                $counta++;
            }
            # the user selected the second system
            if ( $field[4] eq $field[3] ) {
                print "$query[$i]: $field[2], $field[3]: $field[4]\n";
                $countb++;
            }
            # meant to count users who selected both systems
            if ( $field[4] eq ( $field[2] && $field[3] ) ) {
                #print "$query[$i]: $field[2]$field[3]\n";
                $countc++;
            }
        }
    }
}

Data: for each query, I have 3 different combinations of string comparisons, each in both orders.

comparison("lucene-std-rel","lucene-noLen-rr");
comparison("lucene-noLen-rr","lucene-std-rel");
comparison("lucene-noLen-rr","random");
comparison("random","lucene-noLen-rr");
comparison("lucene-noLen-rr","lucene-nolen-rel");
comparison("lucene-nolen-rel","lucene-noLen-rr");

Example data for one pair (7 users evaluate each pair):

user1,male,lucene-std-rel,random,lucene-std-relrandom
user2,male,lucene-std-rel,random,lucene-std-rel
user3,male,lucene-std-rel,random,lucene-std-rel
user4,male,lucene-std-rel,random,lucene-std-rel
user5,male,lucene-std-rel,random,lucene-std-relrandom
user6,male,lucene-std-rel,random,lucene-std-rel
user7,male,lucene-std-rel,random,lucene-std-rel

Example of the required output, for query 1: male fitness models

lucene-std-rel:5, random:0, both:2 ---> majority:lucene-std-rel

Any help is very much appreciated.

世俗缘 2024-10-31 17:24:09

Well, without making this more complex than you requested, here is what I came up with as a possible approach.

#!/usr/bin/perl
use strict;
use warnings;

# Tally how often each choice (A, B, or AB for "both") was selected.
my %counter = ( A => 0, B => 0, AB => 0 );

while (<DATA>) {
    chomp;
    next unless length;    # skip blank lines
    my ( $workerId, $query, $sys1, $sys2, $resultSelected ) = split /,/;
    $counter{$resultSelected}++;
}

# The majority is the key with the highest count (ties are not handled).
my ($majority) = sort { $counter{$b} <=> $counter{$a} } keys %counter;
print "A: $counter{A} B: $counter{B} both(AB): $counter{AB} majority: $majority\n";

__DATA__
user1,male,A,B,A
user2,male,A,B,AB
user3,male,A,B,B
user4,male,A,B,A
user5,male,A,B,A

The output of this is:
A: 3 B: 1 both(AB): 1 majority: A

I don't think my example fully addresses the case where more than one type ties for the "majority". For instance, if A and B both have 9, I'd expect both to be listed there. I didn't bother to handle that since you didn't ask, but hopefully this gets you on the right path.
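
For the tie case, here is a minimal sketch of one way to list every winner. It assumes the counts are already in a hash shaped like %counter; majority_keys is a hypothetical helper name used only for illustration, not part of the answer's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every key that shares the highest count,
# sorted by name, so ties are reported instead of silently dropped.
sub majority_keys {
    my (%count) = @_;

    # Find the highest count.
    my $max;
    for my $n ( values %count ) {
        $max = $n if !defined $max || $n > $max;
    }
    return () unless defined $max;    # empty input: no winner

    # Keep all keys that reach it.
    my @top = grep { $count{$_} == $max } keys %count;
    @top = sort @top;
    return @top;
}

# Illustrative counts only: A and B tie at 9.
my @winners = majority_keys( A => 9, B => 9, AB => 2 );
print "majority: ", join( ", ", @winners ), "\n";
```

Called in list context, the helper returns one name when there is a clear winner and several when counts are tied, so the caller can decide how to print or break the tie.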

百合的盛世恋 2024-10-31 17:24:09

open( INFILE, '<', 'compare.csv' ) or die("Can not open input file: $!");

# $method is not defined in this excerpt
while ( $file = <INFILE> ) {
    @field = parse_csv($file);
    chomp(@field);
    @query = $field[1];

    for ( $i = 0; $i < @query; ++$i ) {
        if ( ( $field[2] eq $method ) || ( $field[3] eq $method ) ) {
            # the user selected the first system
            if ( $field[4] eq $field[2] ) {
                print "$query[$i]: $field[2], $field[3], $field[4]\n";
                $counta++;
            }
            # the user selected the second system
            if ( $field[4] eq $field[3] ) {
                print "$query[$i]: $field[2], $field[3]: $field[4]\n";
                $countb++;
            }
            # meant to count "both"; note that ($field[2] && $field[3])
            # evaluates to $field[3] whenever $field[2] is true
            if ( $field[4] eq ( $field[2] && $field[3] ) ) {
                #print "$query[$i]: $field[2]$field[3]\n";
                $countc++;
            }
        }
    }
}

# Split one CSV record into fields, allowing double-quoted fields.
sub parse_csv {
    my $text = shift;
    my @new  = ();
    push( @new, $+ ) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?
        | ([^,]+),?
        | ,
    }gx;
    push( @new, undef ) if substr( $text, -1, 1 ) eq ',';
    return @new;
}
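
The test `$field[4] eq ($field[2] && $field[3])` cannot detect a "both" vote, because `&&` returns one of its operands rather than joining them. Below is a rough sketch of a per-pair tally, under these assumptions: each row is user,query,sys1,sys2,selected; a "both" vote is stored as the two system names concatenated (as in the sample value lucene-std-relrandom); and a pair shown in either order counts as the same pair. The names classify and tally are hypothetical helpers, and a plain split is used instead of parse_csv because the sample rows contain no quoted fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify one vote as a system name or 'both'.
# Assumption: "both" is the two names concatenated, in either order.
sub classify {
    my ( $sys1, $sys2, $selected ) = @_;
    return $sys1  if $selected eq $sys1;
    return $sys2  if $selected eq $sys2;
    return 'both' if $selected eq $sys1 . $sys2 || $selected eq $sys2 . $sys1;
    return;    # unrecognized vote
}

# Tally votes per query and per pair; (A,B) and (B,A) share one tally.
sub tally {
    my (@lines) = @_;
    my %tally;
    for my $line (@lines) {
        chomp $line;
        next unless length $line;
        my ( $user, $query, $sys1, $sys2, $selected ) = split /,/, $line;

        # Order-independent key for the pair.
        my @names = ( $sys1, $sys2 );
        my $pair = join '|', sort @names;

        # Start every option at zero so unchosen systems still appear.
        $tally{$query}{$pair} ||= { $sys1 => 0, $sys2 => 0, both => 0 };

        my $vote = classify( $sys1, $sys2, $selected );
        $tally{$query}{$pair}{$vote}++ if defined $vote;
    }
    return \%tally;
}

# The seven sample rows from the question.
my @rows = (
    'user1,male,lucene-std-rel,random,lucene-std-relrandom',
    'user2,male,lucene-std-rel,random,lucene-std-rel',
    'user3,male,lucene-std-rel,random,lucene-std-rel',
    'user4,male,lucene-std-rel,random,lucene-std-rel',
    'user5,male,lucene-std-rel,random,lucene-std-relrandom',
    'user6,male,lucene-std-rel,random,lucene-std-rel',
    'user7,male,lucene-std-rel,random,lucene-std-rel',
);

my $tally = tally(@rows);
for my $query ( sort keys %$tally ) {
    for my $pair ( sort keys %{ $tally->{$query} } ) {
        my $votes = $tally->{$query}{$pair};

        # Highest count first; ties are not resolved in this sketch.
        my ($top) = sort { $votes->{$b} <=> $votes->{$a} } keys %$votes;
        my $summary = join ', ',
            map { sprintf '%s:%d', $_, $votes->{$_} } sort keys %$votes;
        printf "%s [%s] %s ---> majority:%s\n", $query, $pair, $summary, $top;
    }
}
```

With the seven sample rows, this counts 5 votes for lucene-std-rel, 0 for random, and 2 for both. Keying the tally on query and sorted pair is what keeps the 7 users of one comparison separate from the users of the other comparisons.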