Quantifier range doesn't work in lookbehind

Posted 2024-10-16 16:01:24

I'm working on a project where I need a regex that matches a * followed by 1-4 spaces or tabs and then a row of text. Right now I'm using .* after the lookbehind for testing purposes. I can get it to match explicitly 1, 2, 3 or 4 spaces/tabs, but not a range of 1-4. I'm testing against the following block:

*    test line here
*   Second test
*  Third test
* Another test

These are the two patterns I'm testing. The first, (?<=(\*[ \t]{3})).*, works as expected and matches the 2nd line; the same holds if I replace 3 with 1, 2 or 4. However, if I replace it with 1,4, giving the pattern (?<=(\*[ \t]{1,4})).*, it no longer matches any of the rows, and I honestly can't understand why. I've tried googling without success. I'm using the g(lobal) flag.
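The behaviour described above can be reproduced with a short sketch. It assumes Python's built-in re module as a stand-in, since the question does not name its engine. re enforces a similar fixed-width rule for lookbehind, but it rejects the pattern at compile time instead of silently matching nothing. TEXT and the variable names are illustrative, and the redundant capturing group inside the lookbehind is dropped.

```python
import re

# The four test lines from the question (4, 3, 2 and 1 spaces after the asterisk).
TEXT = "*    test line here\n*   Second test\n*  Third test\n* Another test"

# Fixed-width lookbehind: exactly three spaces/tabs after the asterisk.
fixed = re.compile(r"(?<=\*[ \t]{3}).*")
fixed_matches = [m.group(0) for m in fixed.finditer(TEXT)]

# Variable-width lookbehind: Python's re refuses to compile it.
try:
    re.compile(r"(?<=\*[ \t]{1,4}).*")
    variable_error = None
except re.error as exc:
    variable_error = str(exc)

print(fixed_matches)
print(variable_error)
```

In this sketch the {3} pattern also matches the first line, with one leading space left in the match, because the lookbehind is already satisfied after the third of its four spaces.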

Comments (1)

彩虹直至黑白 2024-10-23 16:01:24

PHP, like many regex flavors, doesn't support variable-length lookbehind. The only exception is alternation (|) at the top level of the lookbehind, where each alternative may have a different fixed length. Even a ? can break the pattern. One alternative is to use:

(?<=\*[ \t]|\*[ \t]{2}|\*[ \t]{3}|\*[ \t]{4}).*

Or better, drop the lookbehind and use a capturing group instead:

\*[ \t]{1,4}(.*)

This should work well for you, since your matches don't appear to overlap anyway.
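A minimal sketch of the capturing-group approach, again assuming Python's re module rather than PHP (in PHP, preg_match_all would play the same role). TEXT and the variable names are illustrative. The prefix is consumed by the match itself, and the text of interest is read from group 1.

```python
import re

# The four test lines from the question.
TEXT = "*    test line here\n*   Second test\n*  Third test\n* Another test"

# Match the asterisk and 1-4 spaces/tabs normally, then capture the rest of the line.
pattern = re.compile(r"\*[ \t]{1,4}(.*)")

# Group 1 holds the line text without the "*" prefix and its whitespace.
captured = [m.group(1) for m in pattern.finditer(TEXT)]

print(captured)
```

Because the quantifier sits outside any lookbehind, the {1,4} range is allowed. The trade-off is that the full match (group 0) now includes the prefix, so the caller has to read the capture group.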

From the manual:

The contents of a lookbehind assertion are restricted such that all the strings it matches must have a fixed length. However, if there are several alternatives, they do not all have to have the same fixed length. Thus (?<=bullock|donkey) is permitted, but (?<!dogs?|cats?) causes an error at compile time. Branches that match different length strings are permitted only at the top level of a lookbehind assertion.

Source: http://www.php.net/manual/en/regexp.reference.assertions.php
