How do I map regex capture values ($1, $2, etc.) to a hash?

Posted 2024-12-18 19:10:28

my (@keys,@values) = ($text =~ /\{IS\:([a-zA-Z0-9_-]+)\}(.*)\{\\IS\:([a-zA-Z0-9_-]+)\}/g);

is supposed to match strings like this

{IS:cow}moo{\IS:cow}
{IS:cow}moo{\IS:cow}    
{IS:dog}bark{\IS:dog}
{IS:dog}meow{\IS:dog} #probably not a dog

which works fine, except that all the $1, $2, and $3 values get dumped into @keys, so I'm trying to figure out how to get them into a nice hash of $1 => $2 pairs.

For full context, what I'd really like is for the regex to return a data structure like the following, with a count appended for the number of times each key was found:

{ 
  cow_1 => moo,
  cow_2 => moo,
  dog_1 => bark,
  dog_2 => meow,
}

Is there a way to use the map { } function to accomplish this with a regex? Something like this, maybe?

my %datahash = map { ( $1 eq $3 ) ? { $1 => $2 } : undef } @{ regex...};

$1 should equal $3 to make sure it's a matching tag (no recursive checking is needed, since these tags aren't nested); if so, use $1 as the key and $2 as the value.

Then, for each of these key => value pairs, I want to replace

{IS:cow}moo{\IS:cow}
{IS:cow}moo{\IS:cow}   

with

{cow_1}
{cow_2}

Then, if $cachedData{cow} is true, all cow_* placeholders will be replaced with the value stored under that key in %datahash.
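
A note on why everything lands in @keys: in a list assignment the first array on the left-hand side absorbs every value, so @values always stays empty. With /g in list context the match returns one flat list of all captures, three per match. A minimal sketch of walking that list, assuming one tag pair per line as in the sample data (the variable names are illustrative, not from the original code):

```perl
use strict;
use warnings;

my $text = join "\n",
    '{IS:cow}moo{\IS:cow}',
    '{IS:dog}bark{\IS:dog}';

# In list context, /g returns every capture from every match as one flat list.
my @captures = $text =~ /\{IS:([\w-]+)\}(.*)\{\\IS:([\w-]+)\}/g;

# Walk the flat list three items at a time: open tag, body, close tag.
my %pairs;
while ( my ( $open, $body, $close ) = splice @captures, 0, 3 ) {
    next unless $open eq $close;    # skip mismatched tags
    $pairs{$open} = $body;
}

print "$_ => $pairs{$_}\n" for sort keys %pairs;
```

This keeps only the last body seen for each tag name; appending a counter to the key is a separate step.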

Comments (3)

攀登最高峰 2024-12-25 19:10:28
$hash{$1} = $2 while 
        $text =~ /\{IS\:([a-zA-Z0-9_-]+)\}
                           (.*)
                  \{\\IS\:([a-zA-Z0-9_-]+)\}/gx;

(/x modifier added for readability)

枫以 2024-12-25 19:10:28

I removed useless backslashes and parens from the regex and used shortcuts in the char class:

#!/usr/bin/perl
use warnings;
use strict;

my $text = '{IS:cow}moo{\IS:cow}
{IS:cow}moo{\IS:cow}    
{IS:dog}bark{\IS:dog}
{IS:dog}meow{\IS:dog}';

my %cnt;
my %animals;
while ( $text =~ /\{IS:([\w-]+)}(.*)\{\\IS:[\w-]+}/g ){
    $animals{$1 . '_' . ++$cnt{$1}} = $2;
}

print "$_ => $animals{$_}\n" for sort keys %animals;
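
The loop above accepts any closing tag name and leaves $text unchanged. One possible way to also cover the replacement step from the question is a single s///ge pass with a \1 backreference, so the closing tag has to repeat the opening name. This is only a sketch: the helper name placeholder is hypothetical, and the non-greedy (.*?) assumes a body never contains a nested tag of the same name.

```perl
use strict;
use warnings;

my $text = '{IS:cow}moo{\IS:cow}
{IS:cow}moo{\IS:cow}
{IS:dog}bark{\IS:dog}
{IS:dog}meow{\IS:dog}';

my ( %cnt, %animals );

# Hypothetical helper: records the body under a numbered key
# and returns the placeholder that replaces the tagged text.
sub placeholder {
    my ( $name, $body ) = @_;
    my $key = $name . '_' . ++$cnt{$name};    # cow_1, cow_2, dog_1, ...
    $animals{$key} = $body;
    return '{' . $key . '}';
}

# \1 forces the closing tag to repeat the opening tag's name.
# /e evaluates the replacement as code; /g repeats it for every match.
$text =~ s/\{IS:([\w-]+)\}(.*?)\{\\IS:\1\}/placeholder($1, $2)/ge;

print "$text\n";
print "$_ => $animals{$_}\n" for sort keys %animals;
```

Afterwards $text holds only the numbered placeholders, and %animals maps each placeholder name back to its original body, which is the lookup the later $cachedData step needs.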
攒眉千度 2024-12-25 19:10:28
  • $dataHash{cow}[$num] holds the same information as $dataHash{"cow_$num"}.
  • It's easier to get everything that's a cow with $dataHash{cow}, as opposed to "scanning" the keys with

    @dataHash{ grep { m/^cow_/ } keys %dataHash }
  • It also keeps the source data ('cow') separate from the synthetic data ('1', because this is the first time I've seen this key).

So, I thought it was a good time to bring multi_hash into play.

sub multi_hash {
    use List::Pairwise qw<mapp>;
    my %h;
    mapp { push @{ $h{ $a } }, $b } @_;
    return wantarray ? %h : \%h;
}

With that idiom, you can build a hash similar to what you want, like so:

my %dataHash 
    = multi_hash(  map { m/[{]IS:([\w-]+)[}]([^{]+)[{]\\IS:\1[}]/ } @lines )
    ;

This gives me:

%dataHash: {
             cow => [
                      'moo',
                      'moo'
                    ],
             dog => [
                      'bark',
                      'meow'
                    ]
           }
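
If List::Pairwise is not installed, roughly the same structure can be built with core Perl alone. The helper below, multi_hash_core, is a hypothetical stand-in for multi_hash and not part of any module, and @lines is assumed to hold one tag pair per element:

```perl
use strict;
use warnings;

# Hypothetical core-only stand-in for multi_hash: consumes a flat
# key/value list and groups the values into an array per key.
sub multi_hash_core {
    my @pairs = @_;
    my %h;
    while ( my ( $key, $value ) = splice @pairs, 0, 2 ) {
        push @{ $h{$key} }, $value;
    }
    return wantarray ? %h : \%h;
}

my @lines = (
    '{IS:cow}moo{\IS:cow}',
    '{IS:cow}moo{\IS:cow}',
    '{IS:dog}bark{\IS:dog}',
    '{IS:dog}meow{\IS:dog}',
);

# In list context each successful match returns ($1, $2);
# a failed match returns an empty list, so that line is skipped.
my %dataHash = multi_hash_core(
    map { m/[{]IS:([\w-]+)[}]([^{]+)[{]\\IS:\1[}]/ } @lines
);

# Array indices start at 0, so the second cow is $dataHash{cow}[1].
print "$_: @{ $dataHash{$_} }\n" for sort keys %dataHash;
```

Note the off-by-one against the cow_1, cow_2 naming in the question: the array form counts from 0.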