Matching n occurrences (excluding the last occurrence)
I have a question about regex. I don't know why I cannot do the following.
Sample sentence:
"This is a test string with five t's"
The regex I use:
^(.*?(?=t)){3}
I want the regex to match the following.
"This is a test s"
But it doesn't work, does anyone know why?
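To make the symptom concrete, here is a minimal Python sketch that runs the pattern above against the sample sentence and shows what it actually matches. It assumes Python's built-in re module; other regex engines may treat empty iterations differently, and the variable names are only illustrative.

```python
import re

sentence = "This is a test string with five t's"
pattern = r"^(.*?(?=t)){3}"

match = re.match(pattern, sentence)
actual = match.group(0)

# The match ends right before the first lowercase t,
# not before the third one as intended.
print(repr(actual))  # prints 'This is a '
```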
Comments (2)
The point here is that the whole .*?(?=t) group pattern can match an empty string. The first iteration stops before the first t, and the next ones cannot "hop through" it: a lookahead is a non-consuming pattern, so when it matches, the regex index remains where it is and the remaining iterations match the empty string at that same position. You cannot do it like this; each iteration must consume at least one char (and so move the regex index).
An alternative solution for this concrete case is
^(?:[^t]*t){2}[^t]*
It matches the start of the string (^), then consumes two occurrences ({2}) of any chars other than t ([^t]*) followed by t, and then consumes any remaining chars other than t ([^t]*), stopping right before the third t.
Or, a general-case solution (if t is a multicharacter string): the (?:.*?t){2} pattern matches two occurrences of any 0+ chars, as few as possible, up to and including the next t, and then (?:(?!t).)* matches any char, 0+ occurrences, that does not start a t char sequence.
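Both alternatives can be checked with a short Python sketch. Two things here are assumptions rather than part of the answer: the general-case pattern is written as the two described parts joined together and anchored with ^, and before_nth is a hypothetical helper added only to illustrate the multicharacter case.

```python
import re

sentence = "This is a test string with five t's"

# Single-character case: "non-t chars, then t" twice, then a run of non-t chars.
single_char = re.match(r"^(?:[^t]*t){2}[^t]*", sentence).group(0)

# General case: the two described parts joined and anchored with ^ (assumption).
general = re.match(r"^(?:.*?t){2}(?:(?!t).)*", sentence).group(0)


def before_nth(text, token, n):
    """Hypothetical helper: prefix of text ending right before the n-th token."""
    tok = re.escape(token)  # treat the token literally, even if it has regex metacharacters
    pattern = rf"^(?:.*?{tok}){{{n - 1}}}(?:(?!{tok}).)*"
    m = re.match(pattern, text)
    return m.group(0) if m else None


print(repr(single_char))
print(repr(general))
print(repr(before_nth("a--b--c--d", "--", 3)))
```

Note that if the text contains fewer than n occurrences of the token, the trailing part can run to the end of the line, so a caller that needs a strict check would have to verify the occurrence count separately.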
As said by @CertainPerformance, .* will match zero or more characters, but you use its lazy version .*?. The lazy version of a quantifier makes it match as few characters as possible. Because .*? is allowed to match the empty string, every iteration that starts right before a t ends up as a zero-length match.
You need to use the + quantifier instead (.+? in place of .*?) in order to prevent an empty string match. Demonstration with Python:
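A minimal sketch of such a demonstration, not necessarily the answerer's own code, assuming Python's built-in re module and illustrative variable names:

```python
import re

sentence = "This is a test string with five t's"

# With +, every iteration has to consume at least one character,
# so the group can no longer sit still in front of the same t.
lazy_plus = re.match(r"^(.+?(?=t)){3}", sentence).group(0)

# The three chunks consumed by the three iterations, found one after another.
chunks = re.findall(r".+?(?=t)", sentence)[:3]

print(repr(lazy_plus))
print(chunks)
```

Each chunk after the first starts with the t that the previous iteration stopped in front of, which is what lets the match advance past it.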