Ruby 1.9 regex lookbehind assertions and anchors
Ruby 1.9 regex supports lookbehind assertions, but I run into trouble when I use anchors inside the lookbehind. When the same anchors are used in a lookahead assertion, it runs just fine.
"well substring! "[/(?<=^|\A|\s|\b)substring!(?=$|\Z|\s|\b)/] #=> RegexpError: invalid pattern in look-behind: /(?<=^|\A|\s|\b)substring(?=$|\Z|\s|\b)/
Does anybody know how to make anchors work in lookbehind assertions the way they do in lookaheads?
Is there a special escape sequence or grouping that is required for lookbehind?
I have tested this behavior in 1.9.1-p243, p376 and 1.9.2-preview3 just in case it was patched.
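One way to narrow down which alternative triggers the error is to compile each piece inside its own lookbehind and see which ones are rejected. The sketch below assumes nothing beyond `Regexp.new` and `RegexpError` from core Ruby; the helper name `compiles?` and the candidate list are made up for illustration, and the outcome for each piece may differ between Ruby versions.

```ruby
# Hypothetical helper: true if the pattern source compiles, false if the
# regex engine rejects it with a RegexpError.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# The four alternatives from the lookbehind in the question, tested one at a time.
CANDIDATES = ['^', '\A', '\s', '\b'].freeze

results = CANDIDATES.map do |piece|
  [piece, compiles?("(?<=#{piece})substring!")]
end

results.each do |piece, ok|
  puts "#{piece.ljust(3)} #{ok ? 'accepted' : 'rejected'} in lookbehind"
end
```

Running this on each interpreter you care about shows whether a given anchor is accepted there, without relying on the full combined pattern.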
Comments (3)
Looks like you're right:
\b works as expected in a lookahead, but in a lookbehind it's treated as a syntax error.
It doesn't really matter in this case: if (?<=^|\A|\s|\b) would have yielded the desired result, \b is all you needed anyway. The character following the assertion has to be s (a word character), so \b means either (1) the previous character is not a word character, or (2) there is no previous character. That being the case, ^, \A and \s are all redundant.
However, if the string starts with ! it's a different story. ^ and \A still match the beginning of the string, before the !, but \b matches after it. If you want to match !substring! as a complete string you have to use /\A!substring!\Z/, but if you only want to match the whole word substring you have to use /\bsubstring\b/.
As for [^\B], that just matches any character except B. Like \b, \B is a zero-width assertion, and a character class has to match exactly one character. Some regex flavors would throw an exception for the invalid escape sequence \B, but Ruby (or Oniguruma, more likely) lets it slide.
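As a small illustration of the two patterns suggested in this answer, here is a sketch using the sample string from the question. The constant names are made up for the example; the patterns themselves are the ones quoted above.

```ruby
# Whole-word match: \b outside any lookaround, as suggested in the answer.
WHOLE_WORD = /\bsubstring\b/

# Complete-string match for the variant that starts and ends with "!".
FULL_STRING = /\A!substring!\Z/

puts "well substring! "[WHOLE_WORD].inspect   # => "substring"
puts "mysubstring"[WHOLE_WORD].inspect        # => nil (no boundary before "s")

puts "!substring!"[FULL_STRING].inspect       # => "!substring!"
puts "well !substring!"[FULL_STRING].inspect  # => nil (not the whole string)

# \A matches before the "!", while the first \b sits after it.
puts "!substring!".index(/\A/)  # => 0
puts "!substring!".index(/\b/)  # => 1
```

Note that the whole-word pattern returns "substring" without the trailing "!", because "!" is not a word character and the closing \b falls between "g" and "!".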
Looks like the interpretation of the lookbehind is that of a range [] and not a group () like lookahead assertions. That possibly means \b is an invalid backspace character and not a word boundary.
When all else fails... use a double negative!
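The answer does not spell out the pattern, so the following is only one reading of "double negative": a negative lookaround wrapped around a negated class. Here `(?<!\S)` means "not preceded by a non-whitespace character", which holds at the start of the string or after whitespace, and `(?!\S)` does the same at the end. The constant name is hypothetical, and this covers the anchor and `\s` alternatives from the question, not `\b`.

```ruby
# Double negative: "not preceded by non-whitespace" and
# "not followed by non-whitespace", with no anchors inside the lookbehind.
DOUBLE_NEGATIVE = /(?<!\S)substring!(?!\S)/

puts "well substring! "[DOUBLE_NEGATIVE].inspect   # => "substring!"
puts "substring!"[DOUBLE_NEGATIVE].inspect         # => "substring!"
puts "xsubstring!"[DOUBLE_NEGATIVE].inspect        # => nil
puts "well substring!? "[DOUBLE_NEGATIVE].inspect  # => nil
```

Because the lookbehind body is a single fixed-width character class, it avoids the mix of anchors and alternation that triggered the error in the question.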
Yep, looks like Ruby 1.9.2 doesn't support \b in a lookbehind.