Why doesn't my non-greedy Perl regex match anything?
I thought I understood Perl RE to a reasonable extent, but this is puzzling me:
#!/usr/bin/perl
use strict;
use warnings;
my $test = "'some random string'";
if ($test =~ /\'?(.*?)\'?/) {
    print "Captured $1\n";
    print "Matched $&";
}
else {
    print "What?!!";
}
prints
Captured
Matched '
It seems it has matched the ending ' alone, and so captured nothing.
I would have expected it to match the entire thing, or if it's totally non-greedy, nothing at all (as everything there is an optional match).
This in-between behaviour baffles me. Can anyone explain what is happening?
Comments (5)
The \'? at the beginning and end means match 0 or 1 apostrophes greedily. (As another poster has pointed out, to make it non-greedy, it would have to be \'??.) The .*? in the middle means match 0 or more characters non-greedily.
The Perl regular expression engine starts at the beginning of the string. The leading \'? is greedy, so it picks up the first apostrophe. The .*? then matches non-greedily (so it takes as little as it can), followed by an optional apostrophe. Both of these are satisfied by the empty string, so the overall match is just the leading apostrophe and the capture is empty.
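The walk-through above can be checked directly. The sketch below is illustrative only: probe is a hypothetical helper (not from the original post) that reports the matched text, its offsets, and the first capture, using Perl's built-in @- and @+ arrays.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# probe() is a hypothetical helper: it applies a regex and reports
# what was matched, where, and what the first group captured.
sub probe {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;
    # $-[0] and $+[0] hold the start and end offsets of the whole match.
    my ($start, $end) = ($-[0], $+[0]);
    return {
        matched  => substr($string, $start, $end - $start),
        captured => $1,
        start    => $start,
        end      => $end,
    };
}

my $test = "'some random string'";
my $r = probe($test, qr/'?(.*?)'?/);
printf "matched [%s] at offsets %d..%d, captured [%s]\n",
    @{$r}{qw(matched start end captured)};
# prints: matched ['] at offsets 0..1, captured []
```

The match runs from offset 0 to offset 1, so it is the leading apostrophe that gets matched, not the trailing one, and the group is empty.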
I think you mean something like:
or
The single quotes don't need to be escaped, AFAIK.
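As a quick check of the escaping remark, here is a minimal sketch. The pattern with two required quotes is an assumption chosen for illustration, not necessarily what this answer proposed.

```perl
use strict;
use warnings;

my $test = "'some random string'";

# An apostrophe is not a regex metacharacter, so \' and ' match the same
# thing (as long as ' is not used as the pattern delimiter).
my ($escaped)   = $test =~ /\'(.*?)\'/;
my ($unescaped) = $test =~ /'(.*?)'/;

print "escaped:   [$escaped]\n";
print "unescaped: [$unescaped]\n";
print "same result\n" if $escaped eq $unescaped;
```

Both forms capture the text between the quotes, so the backslashes are harmless but unnecessary.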
pattern? is greedy; if you want it to be non-greedy, you must say pattern?? (see perldoc perlre).
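The difference between ? and ?? is easy to see on the string from the question. What follows is an illustrative sketch, not a quotation from the documentation; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $test = "'some random string'";

# Greedy ?: takes the apostrophe when one is available.
my ($greedy) = $test =~ /('?)/;
# Non-greedy ??: prefers zero occurrences, so it takes nothing here.
my ($lazy) = $test =~ /('??)/;

printf "greedy ? captured [%s]\n", $greedy;
printf "lazy ?? captured [%s]\n", $lazy;

# With every quantifier non-greedy, the whole pattern is satisfied by
# the empty string at offset 0.
my $all_lazy_length;
if ($test =~ /'??(.*?)'??/) {
    $all_lazy_length = $+[0] - $-[0];
    printf "all-lazy match length: %d\n", $all_lazy_length;
}
```

With all three quantifiers non-greedy, the match succeeds but is empty, which is the "nothing at all" outcome the question was expecting from a totally non-greedy pattern.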
Beware of making all elements of your regex optional (i.e. having all elements quantified with * or ?). This lets the Perl regex engine match as much as it wants (even nothing), while still considering the match successful.
I suspect what you want is
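To illustrate the warning about all-optional patterns, here is a small sketch. The pattern with both quotes required is an assumption about the kind of fix intended, and matched_text is a hypothetical helper.

```perl
use strict;
use warnings;

my $all_optional = qr/'?(.*?)'?/;   # the pattern from the question
my $required     = qr/'(.*?)'/;     # assumed alternative: both quotes mandatory

# Hypothetical helper: return the text of the whole match, or undef.
sub matched_text {
    my ($string, $regex) = @_;
    return $string =~ $regex ? substr($string, $-[0], $+[0] - $-[0]) : undef;
}

for my $input ("'some random string'", "no quotes here", "") {
    my $opt = matched_text($input, $all_optional);
    my $req = matched_text($input, $required);
    printf "input [%s]: all-optional %s, required quotes %s\n",
        $input,
        defined $opt ? "matched [$opt]" : "failed",
        defined $req ? "matched [$req]" : "failed";
}
```

The all-optional pattern succeeds on every input, including the empty string, while the version with mandatory quotes only succeeds when a quoted section is actually present.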
I would say the closest answer to what you are looking for is
So "get the single quote if it's there", "get anything and everything that's not a single quote", "get the last single quote if it's there".
Unless you want to match "'don't do this'" - but who uses an apostrophe in a single quote anyway (and gets away with it for long)? :)
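A pattern consistent with the three steps described above can be sketched with a negated character class. The exact regex is an assumption reconstructed from that description, not a snippet taken from the answer.

```perl
use strict;
use warnings;

# Assumed pattern: optional quote, any run of non-quote characters,
# optional quote.
my $re = qr/'?([^']*)'?/;

for my $input ("'some random string'", "'don't do this'") {
    if ($input =~ $re) {
        my $whole = substr($input, $-[0], $+[0] - $-[0]);
        print "input [$input]: matched [$whole], captured [$1]\n";
    }
}
# prints:
# input ['some random string']: matched ['some random string'], captured [some random string]
# input ['don't do this']: matched ['don'], captured [don]
```

The second input shows the caveat from the answer: an apostrophe inside the quoted text ends the capture early.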