How do I escape metacharacters when interpolating a variable into Perl's match operator?
Suppose I have a file containing lines I'm trying to match against:
foo
quux
bar
In my code, I have another array:
foo
baz
quux
Let's say we iterate through the file, calling each element $word, and the internal list we are checking against is @arr.
if( grep {$_ =~ m/^$word$/i} @arr)
This works correctly, but in the somewhat possible case where the file contains a test case of fo., the . operates as a wildcard in the regex, so fo. matches foo, which is not acceptable.
This is of course because Perl is interpolating the variable into a regex.
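A minimal sketch that reproduces the behaviour described above. The array contents come from the question; the value assigned to $word is a stand-in for a line read from the file.

```perl
use strict;
use warnings;

my @arr  = qw(foo baz quux);   # the internal list
my $word = 'fo.';              # stand-in for a line read from the file

# $word is interpolated as-is, so its dot matches any single character.
my @hits = grep { $_ =~ m/^$word$/i } @arr;
print "fo. matched: @hits\n";   # foo is reported as a match
```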
The question:
How do I force Perl to use the variable literally?
Use \Q...\E to escape special symbols directly in the Perl string after variable interpolation.
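Applied to the grep from the question, this could look like the sketch below. The sample data is taken from the question; the second check against 'FO.' is an added illustration, not part of the original answer.

```perl
use strict;
use warnings;

my @arr  = qw(foo baz quux);
my $word = 'fo.';

# \Q...\E backslashes the metacharacters in $word, so the dot is literal.
my @hits = grep { $_ =~ m/^\Q$word\E$/i } @arr;
print scalar(@hits), " match(es) in \@arr\n";

# A literal, case-insensitive match still works as expected.
my @literal = grep { $_ =~ m/^\Q$word\E$/i } ( 'FO.', 'foo' );
print "literal match: @literal\n";
```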
From perlfaq6's answer to How do I match a regular expression that's in a variable?:
We don't have to hard-code patterns into the match operator (or anything else that works with regular expressions). We can put the pattern in a variable for later use.
The match operator is a double quote context, so you can interpolate your variable just like a double quoted string. In this case, you read the regular expression as user input and store it in $regex. Once you have the pattern in $regex, you use that variable in the match operator.
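As a rough illustration of that paragraph: the pattern below is hard-coded so the snippet runs without input, but it stands in for a line read with chomp( my $regex = <STDIN> ). The sample string is an assumption.

```perl
use strict;
use warnings;

my $string = 'Hello, world';   # sample text, chosen for illustration
my $regex  = 'w.rld';          # stands in for a pattern typed by the user

# The variable is interpolated just as it would be in a double-quoted string.
my $matched = ( $string =~ m/$regex/ ) ? 1 : 0;
print "pattern matched\n" if $matched;
```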
Any regular expression special characters in $regex are still special, and the pattern still has to be valid or Perl will complain. For instance, in this pattern there is an unpaired parenthesis.
When Perl compiles the regular expression, it treats the parenthesis as the start of a memory match. When it doesn't find the closing parenthesis, it complains:
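A sketch of the failure described above, with an assumed pattern and sample string. The eval is there only so the script can print the complaint instead of dying.

```perl
use strict;
use warnings;

my $string = 'Two parens to bind them all';   # sample text
my $regex  = 'Unmatched ( paren';             # the parenthesis is never closed

# Compiling the interpolated pattern fails at run time; eval captures the error.
my $error;
eval { my $hit = ( $string =~ m/$regex/ ); 1 } or $error = $@;
print "Perl complained: $error" if defined $error;
```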
You can get around this in several ways depending on our situation. First, if you don't want any of the characters in the string to be special, you can escape them with quotemeta before you use the string.
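For instance, with quotemeta the same kind of input becomes a harmless literal. The strings here are illustrative assumptions.

```perl
use strict;
use warnings;

my $string = 'An Unmatched ( paren in the text';   # sample text
my $input  = 'Unmatched ( paren';                  # would not compile as a regex

# quotemeta backslashes every non-word character, including the space and "(".
my $regex = quotemeta($input);

my $matched = ( $string =~ m/$regex/ ) ? 1 : 0;
print "literal text found\n" if $matched;
```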
You can also do this directly in the match operator using the \Q and \E sequences. The \Q tells Perl where to start escaping special characters, and the \E tells it where to stop (see perlop for more details).
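One reason to end the quoting with \E is that the rest of the pattern stays special. In this sketch, $prefix and the tag names are made-up examples: the prefix is matched literally while \d+ keeps its usual meaning.

```perl
use strict;
use warnings;

my $prefix = 'v1.';                      # hypothetical literal prefix
my @tags   = qw(v1.42 v1x42 v10.42);     # hypothetical candidates

# Only the part between \Q and \E is escaped; \d+ after \E is still a regex.
my @hits = grep { m/^\Q$prefix\E\d+$/ } @tags;
print "matched: @hits\n";
```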
Alternately, you can use qr//, the regular expression quote operator (see perlop for more details). It quotes and perhaps compiles the pattern, and you can apply regular expression flags to the pattern.
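A small sketch of qr// with a flag attached. Note that qr// on its own does not escape anything, so \Q...\E is combined with it here; the sample words are assumptions.

```perl
use strict;
use warnings;

my $word = 'fo.';

# Build the pattern once, with the /i flag stored inside the regex object.
my $re = qr/^\Q$word\E$/i;

my @hits = grep { $_ =~ $re } qw(foo FO. quux);
print "matched: @hits\n";
```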
You might also want to trap any errors by wrapping an eval block around the whole thing.
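One possible shape for that, assuming an invalid pattern arrives in $regex: the block eval returns undef when the pattern fails to compile, and the message is left in $@.

```perl
use strict;
use warnings;

my $string = 'Two parens to bind them all';   # sample text
my $regex  = 'Unmatched ( paren';             # deliberately invalid

# The eval block yields 1 or 0 on success and undef if the match dies.
my $matched = eval { $string =~ m/$regex/ ? 1 : 0 };

if ( !defined $matched ) {
    print "Bad pattern: $@";
}
elsif ($matched) {
    print "matched\n";
}
```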
Or...
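The alternative is not shown here. One variant, offered only as an assumption about what might follow, is to wrap just the compilation step in eval and reuse the compiled regex afterwards.

```perl
use strict;
use warnings;

my $input = 'Unmatched ( paren';   # deliberately invalid pattern

# qr// compiles the interpolated pattern; eval returns undef if that dies.
my $re = eval { qr/$input/ };

if ( defined $re ) {
    print "compiled fine\n";
}
else {
    print "Could not compile pattern: $@";
}
```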
The correct answer is: don't use regexps. I'm not saying regexps are bad, but using them for what amounts to a simple equality check is overkill.
Use:
grep { lc($_) eq lc($word) } @arr
and be happy.
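A quick check of that comparison against the data from the question. The %found hash and the two test words are added here purely for illustration.

```perl
use strict;
use warnings;

my @arr = qw(foo baz quux);

my %found;   # illustrative: number of matching elements per word
for my $word ( 'fo.', 'FOO' ) {
    # Plain string equality after lowercasing: no metacharacters involved.
    $found{$word} = grep { lc($_) eq lc($word) } @arr;
    printf "%s: %s\n", $word, $found{$word} ? 'found' : 'not found';
}
```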
Quotemeta
Returns the value of EXPR with all non-"word" characters backslashed.
http://perldoc.perl.org/functions/quotemeta.html
I don't think you want a regex in this case, since you aren't matching a pattern. You're looking for a literal sequence of characters that you already know. Build a hash with the values to match and use that to filter @arr.
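The original code for this answer is not shown, so the following is only a sketch of the idea. It assumes the file lines are already in @words, and it lowercases keys to keep the case-insensitivity of the /i flag in the question.

```perl
use strict;
use warnings;

my @words = qw(foo quux bar fo.);   # stand-in for the lines of the file
my @arr   = qw(foo baz quux);       # the internal list

# Index the wanted values once; lookups are then exact string comparisons.
my %wanted = map { ( lc($_) => 1 ) } @words;

my @filtered = grep { $wanted{ lc($_) } } @arr;
print "kept: @filtered\n";
```

Building the hash once also avoids rescanning @arr for every line of the file.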