Ruby 1.9 regex lookbehind assertions and anchors

Posted 2024-09-12 04:33:47 · 380 words · 18 views · 0 comments

Ruby 1.9 regex supports lookbehind assertions, but I seem to have difficulty when passing anchors in the pattern. When the same anchors are passed in a lookahead assertion, it runs just fine.

"well substring! "[/(?<=^|\A|\s|\b)substring!(?=$|\Z|\s|\b)/] #=> RegexpError: invalid pattern in look-behind: /(?<=^|\A|\s|\b)substring(?=$|\Z|\s|\b)/

Does anybody know how to make anchors work in lookbehind assertions as they do in lookahead?

Is there a special escape sequence or grouping that is required for lookbehind?

I have tested this behavior in 1.9.1-p243, p376 and 1.9.2-preview3 just in case it was patched.

Comments (3)

请恋爱 2024-09-19 04:33:47

Looks like you're right: \b works as expected in a lookahead, but in a lookbehind it's treated as a syntax error.

It doesn't really matter in this case: if (?<=^|\A|\s|\b) would have yielded the desired result, \b is all you needed anyway. The character following the assertion has to be s (a word character), so \b means either (1) the previous character is not a word character, or (2) there is no previous character. That being the case, ^, \A and \s are all redundant.
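
A minimal sketch of that point, assuming the goal is only to match substring! when it is not glued to a preceding word character. The negative lookbehind (?<!\w) is a substitution made for illustration, not something from the question: before a word character it expresses the same condition as \b, without placing \b inside a lookbehind.

```ruby
text = "well substring! "

# \b placed directly in the pattern, outside any lookbehind.
with_boundary = text[/\bsubstring!/]

# Negative lookbehind: succeeds when the previous character is not a
# word character, or when there is no previous character at all.
with_negative_lookbehind = text[/(?<!\w)substring!/]

# Neither form matches when "substring!" is glued to a preceding word.
glued_boundary = "wellsubstring! "[/\bsubstring!/]
glued_lookbehind = "wellsubstring! "[/(?<!\w)substring!/]

p with_boundary             # => "substring!"
p with_negative_lookbehind  # => "substring!"
p glued_boundary            # => nil
p glued_lookbehind          # => nil
```

Both forms avoid the look-behind error from the question because \b never appears inside (?<=...).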

However, if the string starts with ! it's a different story. ^ and \A still match the beginning of the string, before the !, but \b matches after it. If you want to match !substring! as a complete string you have to use /\A!substring!\Z/, but if you only want to match the whole word substring you have to use /\bsubstring\b/.
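
The difference can be checked directly. This is a small sketch using the two patterns from the paragraph above; the third pattern, with \b in front of the leading !, is added only to show where \b does not match.

```ruby
input = "!substring!"

# Whole-string match: the anchors sit outside the "!" characters.
whole_string = input =~ /\A!substring!\Z/

# Whole-word match: \b falls between "!" and "s", and between "g" and "!".
whole_word = input[/\bsubstring\b/]

# \b does not match before the leading "!", because neither side of
# that position is a word character.
boundary_before_bang = input =~ /\b!substring/

p whole_string          # => 0
p whole_word            # => "substring"
p boundary_before_bang  # => nil
```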

As for [^\B], that just matches any character except B. Like \b, \B is a zero-width assertion, and a character class has to match exactly one character. Some regex flavors would throw an exception for the invalid escape sequence \B, but Ruby (or Oniguruma, more likely) lets it slide.
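
A short sketch of what that implies, assuming the running Ruby still reads \B inside a character class as a literal B (it may print an "unknown escape" warning when it does). The lookbehind at the end uses the plain class [^B], which is the equivalent described above, to show that this condition is not a word-boundary check.

```ruby
# Inside a character class \B is not an assertion; it is read as the
# letter "B", so the negated class excludes only that one character.
negated = /[^\B]/

excluded = "B"[negated]
letter   = "a"[negated]
punct    = "!"[negated]

# With the equivalent class [^B] in a lookbehind, a match in the middle
# of a word is accepted, which a real word boundary would reject.
mid_word = "wellsubstring"[/(?<=[^B])substring/]
after_b  = "Bsubstring"[/(?<=[^B])substring/]

p excluded
p letter
p punct
p mid_word
p after_b
```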

爱的十字路口 2024-09-19 04:33:47

It looks like the lookbehind is interpreted as a range [] rather than as a group (), which is how lookahead assertions are treated. That possibly means \b is read as an invalid backspace character and not as a word boundary.

"well substring! "[/(?<=^|\A|\s|[^\B])substring!(?=$|\Z|\s|\b)/]  #=> substring!
"well substring! "[/(?<=^|\A|\s|[^\B])substring(?=$|\Z|\s|\b)/]   #=> substring
"well !substring! "[/(?<=^|\A|\s|[^\B])substring(?=$|\Z|\s|\b)/]  #=> substring
"well !substring! "[/(?<=^|\A|\s|[^\B])!substring(?=$|\Z|\s|\b)/] #=> !substring

When all else fails... use a double negative!

我的鱼塘能养鲲 2024-09-19 04:33:47

Yep, looks like Ruby 1.9.2 doesn't support \b inside a lookbehind.

ruby-1.9.2-p180 :034 > "See Jeffs book and it seems fine!".gsub(/(?=s\b)(?<=\bJeff)/,"'")
SyntaxError: (irb):34: invalid pattern in look-behind: /(?=s\b)(?<=\bJeff)/
from /home/pratikk/.rvm/rubies/ruby-1.9.2-p136/bin/irb:16:in `<main>'

ruby-1.9.2-p180 :033 > "See Jeffs book and it seems fine!".gsub(/(?=s\b)(?<=Jeff)/,"'")
 => "See Jeff's book and it seems fine!" 
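
In the irb session above the regex literal fails with a SyntaxError, because a literal is compiled when the code is parsed. As a hedged sketch, the same check can be made at run time by building the pattern with Regexp.new and rescuing RegexpError. The helper name pattern_accepted? is hypothetical, not something from the thread, and the result for the \b pattern depends on the Ruby version and its regex engine.

```ruby
# Hypothetical helper (not from the thread): reports whether the running
# Ruby accepts a pattern, instead of failing when the file is parsed.
def pattern_accepted?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Lookbehind containing \b: rejected by the 1.9.x builds shown above;
# other versions may differ.
puts pattern_accepted?('(?<=\bJeff)')

# Lookbehind without \b, as in the second irb line above.
puts pattern_accepted?('(?<=Jeff)')

# A pattern that is invalid everywhere, to show the false branch.
puts pattern_accepted?('(')
```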