How to get consecutive pairs of words in Perl

Posted on 2024-12-15 11:17:29

With this sentence:

my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";

We want to get all possible consecutive pairs of words.

my $var = ['Mapping and',
           'and quantifying',
           'quantifying mammalian',
           'mammalian transcriptomes',
           'transcriptomes RNA-Seq'];

Is there a compact way to do it?

Comments (4)

情未る 2024-12-22 11:17:31

This works:

my @sent = split(/\s+/, $sent);
my @var = map { $sent[$_] . ' ' . $sent[$_ + 1] } 0 .. $#sent - 1;

That is, split the original string into an array of words, then use map over the indices to join each word with the one that follows it.
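
A minimal sketch of the same idea wrapped in a reusable sub, in case it helps. The name `word_pairs` is made up for illustration and is not part of the answer above. It uses `split ' '` instead of `/\s+/` so that leading whitespace does not produce an empty first word, and it returns an empty list when there are fewer than two words.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative): joins each word
# with the word that follows it.
sub word_pairs {
    my ($text) = @_;
    # split ' ' skips leading whitespace, unlike split /\s+/
    my @words = split ' ', $text;
    # With 0 or 1 words the range 0 .. $#words - 1 is empty, so map returns nothing
    return map { $words[$_] . ' ' . $words[$_ + 1] } 0 .. $#words - 1;
}

my $sent  = "Mapping and quantifying mammalian transcriptomes RNA-Seq";
my @pairs = word_pairs($sent);
print "$_\n" for @pairs;
```

Calling it in list context gives the five pairs from the question; wrap the call in `[ ... ]` if an array reference like `$var` is wanted.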

忘年祭陌 2024-12-22 11:17:31

I don't have it as a single line, but the following code should give you somewhere to start. It basically does it with a push and a regex with /g.

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;
$Data::Dumper::Indent = 1;

my $t1 = 'aa bb cc dd ee ff';
my $t2 = 'aa bb cc dd ee';

foreach my $txt ( $t1, $t2 )
{
    my @a;
    push( @a, $& ) while( $txt =~ /\G\S+(\s+\S+|)\s*/g );
    print Dumper( \@a );
}

As a one-liner, thanks to the syntax from @ysth:

 my @a = $txt =~ /\G(\S+(?:\s+\S+|))\s*/g;

My regex is slightly different in that if you have an odd number of words, the last word still gets an entry.
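
One thing worth checking before using this: the pattern consumes both words of every match, so it produces adjacent, non-overlapping chunks and not the overlapping pairs listed in the question. A quick sketch that runs the one-liner on the same two test strings (the `%chunks` hash is only there to hold the results for illustration):

```perl
use strict;
use warnings;

my %chunks;
for my $txt ('aa bb cc dd ee ff', 'aa bb cc dd ee') {
    # \G anchors each match where the previous one ended,
    # so every word is consumed exactly once
    $chunks{$txt} = [ $txt =~ /\G(\S+(?:\s+\S+|))\s*/g ];
    print join(' | ', @{ $chunks{$txt} }), "\n";
}
```

This prints `aa bb | cc dd | ee ff` for the first string and `aa bb | cc dd | ee` for the second, which shows the lone trailing word mentioned above.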

感性不性感 2024-12-22 11:17:29

Yes.

my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";
my @pairs = $sent =~ /(?=(\S+\s+\S+))\S+/g;
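
For readers unfamiliar with the trick, here is an annotated sketch of why this pattern yields overlapping pairs. The `/x` layout and the comments are additions for illustration; the pattern itself is the one above.

```perl
use strict;
use warnings;

my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";

my @pairs = $sent =~ /
    (?=              # lookahead: matches without consuming any text
        (\S+\s+\S+)  # capture the current word plus the next one
    )
    \S+              # then consume only the current word
/gx;

print "$_\n" for @pairs;
```

Because only the first word of each pair is consumed, the next match can start at the second word and capture it again as the start of the following pair. The whitespace between the two words is kept exactly as it appears in the input.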

擦肩而过的背影 2024-12-22 11:17:29

A variation that (perhaps unwisely) relies on operator evaluation order but doesn't rely on fancy regexes or indices:

my @words = split /\s+/, $sent;
my $last = shift @words;
my @var;
push @var, $last . ' ' . ($last = $_) for @words;
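
If relying on the left operand being read before the assignment on the right feels too fragile, a sketch of the same loop with the update made explicit might look like this. The names `$prev` and `@pairs` are illustrative, and `split ' '` is used so leading whitespace is ignored.

```perl
use strict;
use warnings;

my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";

my @words = split ' ', $sent;
my $prev  = shift @words;          # first word; undef if the string is empty
my @pairs;
for my $word (@words) {
    push @pairs, "$prev $word";    # pair the previous word with the current one
    $prev = $word;                 # the current word becomes the previous one
}
print "$_\n" for @pairs;
```

It is two lines longer, but the order of the read and the update no longer depends on how the concatenation is evaluated.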