What are the theoretical implications of unbounded lookbehind?

Posted on 2024-09-12 01:13:36

Most languages allow only fixed-length or finite-length lookbehind. One notable exception is .NET, which allows the * operator inside a lookbehind.

However, .NET regexes can already recognize balanced parentheses using named captures, and balanced parentheses do not form a regular language. Setting that feature aside, do regexes stay regular when * is allowed in lookbehind? Extended answers covering subexpressions other than * (for example, additional lookaround!) would also be appreciated.

tl;dr: Do regexes stay regular with * in lookbehind?

Comments (3)

初吻给了烟 2024-09-19 01:13:36

I believe the answer to "Does lookaround affect which languages can be matched by regular expressions?" can be extended to prove that adding * in lookbehind (or even nesting such lookbehinds and lookaheads) does not affect the regularity of the expressions. I haven't put more thought into it, though.

Hope that helps!

三人与歌 2024-09-19 01:13:36

.NET's unbounded lookbehind is merely a refinement of an already non-regular feature: fixed, finite or infinite, lookbehinds have no place in a regular grammar. Nor do lookaheads, capturing groups, backreferences, reluctant quantifiers, possessive quantifiers, atomic groups, conditionals, word boundaries, anchors...

If we had to limit ourselves to theoretically-pure regular expressions, 99.9% of current regex users would have no use for them. Asking if a feature is "regular" is a waste of breath; is it useful? That's all that matters.

转角预定愛 2024-09-19 01:13:36

Regular languages are closed under intersection. Add a new symbol & and rewrite the lookbehind:
A(?<=B)C as
(?:AC&.*BC), and we get that lookbehind is regular.

B can clearly use anything that doesn't look past the A/C boundary. That is, anything except lookahead. What happens if a lookbehind may use lookahead, or vice versa? Apply the same rewrite to .*BC, and you're still fine.

So, regular expressions could really add in intersection and infinite-length lookaround (which can include more lookaround to any depth) and it would still be just as efficient.
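
The intersection rewrite above can be sanity-checked by brute force. The sketch below is only an illustration, not part of the original answer: it reads the rewrite in the slightly tighter form (A & .*B)C, so that the lookbehind is pinned to the A/C boundary, and that reading is an assumption. The helper names are hypothetical. Python's re module only accepts fixed-width lookbehind, so the comparison against a real lookbehind uses a fixed-width B; the intersection-based check itself has no such restriction and also works for an unbounded B such as ab*.

```python
import re
from itertools import product


def matches_via_intersection(a, b, c, s):
    """Decide whether s is in (L(a) & Sigma* L(b)) . L(c).

    Tries every split point: the prefix must match `a` AND end with
    something matching `b`; the remainder must match `c`.
    No lookbehind is used anywhere.
    """
    for i in range(len(s) + 1):
        prefix, rest = s[:i], s[i:]
        if (re.fullmatch(a, prefix)
                and re.fullmatch(".*(?:" + b + ")", prefix, re.DOTALL)
                and re.fullmatch(c, rest)):
            return True
    return False


def matches_via_lookbehind(a, b, c, s):
    """Same question, asked with a real lookbehind: A(?<=B)C.

    `b` must be fixed-width here because of Python's re limitation.
    """
    pattern = "(?:%s)(?<=%s)(?:%s)" % (a, b, c)
    return re.fullmatch(pattern, s) is not None


def agree_on_all_strings(a, b, c, alphabet="abc", max_len=6):
    """Compare both deciders on every string up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if matches_via_intersection(a, b, c, s) != matches_via_lookbehind(a, b, c, s):
                return False
    return True


if __name__ == "__main__":
    # Fixed-width B: both formulations can be compared directly.
    print(agree_on_all_strings("[ab]*", "ab", "c*"))
    # Unbounded B (ab*): only the intersection form runs in Python's re.
    print(matches_via_intersection("[ab]*", "ab*", "c*", "babbbc"))
    print(matches_via_intersection("[ab]*", "ab*", "c*", "bbbc"))
```

An exhaustive check over short strings is of course not a proof; it only shows that the lookbehind and the intersection formulation accept the same strings on this sample, which is the equivalence the closure argument relies on.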
