Regex: until but not including
For regex what is the syntax for search until but not including? Kinda like:
Haystack:
The quick red fox jumped over the lazy brown dog
Expression:
.*?quick -> and then everything until it hits the letter "z" but do not include z
Comments (3)
The explicit way of saying "search until X but not including X" is:
(?:(?!X).)*
where X can be any regular expression.
In your case, though, this might be overkill. Here the easiest way would be
[^z]*
This will match anything except z and therefore stop right before the next z.
So .*?quick[^z]* will match "The quick red fox jumped over the la".
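To make the negated character class concrete, here is a minimal Python sketch that runs .*?quick[^z]* against the haystack from the question. The variable names are illustrative only and are not part of the original answer.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

# [^z]* consumes every character that is not "z",
# so the match ends immediately before the first "z".
match = re.match(r".*?quick[^z]*", haystack)
result = match.group(0) if match else None

print(repr(result))
```

The same idea works for any single stop character: put that character inside the negated class.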
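However, as soon as you have more than one simple letter to look out for, (?:(?!X).)* comes into play, for example
(?:(?!lazy).)*
which matches anything until the start of the word lazy.
This is using a lookahead assertion, more specifically a negative lookahead.
.*?quick(?:(?!lazy).)* will match "The quick red fox jumped over the " (up to and including the space before lazy).
Explanation: the group (?:...)* repeats zero or more times. On each repetition, (?!lazy) first asserts that "lazy" does not start at the current position, and only then does . consume one character.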
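The pattern can also be written out piece by piece. The sketch below uses Python's re.VERBOSE flag so each part carries its own comment; the names until_lazy and before_lazy are made up for this illustration.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

until_lazy = re.compile(
    r"""
    .*?quick      # lazily skip ahead to the first "quick"
    (?:           # group, not captured:
        (?!lazy)  #   assert that "lazy" does not start here
        .         #   then consume one character
    )*            # repeat the group zero or more times
    """,
    re.VERBOSE,
)

match = until_lazy.match(haystack)
before_lazy = match.group(0) if match else None

print(repr(before_lazy))
```

Because the lookahead is checked before every single character, the match stops at the first position where "lazy" begins, and the word itself is never consumed.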
Furthermore, when searching for keywords, you might want to surround them with word boundary anchors: \bfox\b will only match the complete word fox but not the fox in foxy.
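A quick way to see the effect of the word boundary anchors is to compare the anchored and unanchored patterns on the same text. This is a small Python sketch; the sample sentence is invented for the illustration.

```python
import re

text = "the fox looks foxy"

# \b matches between a word character and a non-word character (or a string edge),
# so \bfox\b only accepts "fox" when it stands alone as a word.
whole_words = re.findall(r"\bfox\b", text)
any_occurrence = re.findall(r"fox", text)

print(len(whole_words), len(any_occurrence))
```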
Note:
If the text to be matched can also include line breaks, you will need to set the "dot matches all" option of your regex engine. Usually, you can achieve that by prepending (?s) to the regex, but that doesn't work in all regex engines (notably JavaScript).
Alternative solution:
In many cases, you can also use a simpler, more readable solution that uses a lazy quantifier. By adding a ? to the * quantifier, it will try to match as few characters as possible from the current position:
.*?(?=(?:X)|$)
will match any number of characters, stopping right before X (which can be any regex) or the end of the string (if X doesn't match). You may also need to set the "dot matches all" option for this to work. (Note: I added a non-capturing group around X in order to reliably isolate it from the alternation.)
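The lazy-quantifier approach can be wrapped in a small helper. This is a sketch under a few assumptions: the function name text_until is hypothetical, re.DOTALL stands in for the "dot matches all" option, and \Z is used in place of $ because in Python $ also matches just before a trailing newline.

```python
import re


def text_until(stop_pattern: str, text: str) -> str:
    """Return the start of text up to, but not including, the first match
    of stop_pattern, or all of text if stop_pattern never matches."""
    # The non-capturing group keeps an alternation inside stop_pattern
    # from mixing with the "or end of string" alternative.
    pattern = re.compile(r".*?(?=(?:" + stop_pattern + r")|\Z)", re.DOTALL)
    # With DOTALL the lazy .*? can always reach the end of the string,
    # so match() cannot return None here.
    return pattern.match(text).group(0)


multiline = "The quick\nred fox jumped over the lazy brown dog"
print(repr(text_until("lazy|brown", multiline)))
print(repr(text_until("z", "no such letter\n")))
```

Compared with the (?:(?!X).)* form, the lookahead is written only once at the end, which is usually easier to read.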

A lookahead regex syntax can help you to achieve your goal. Thus a regex for your example is
.*?quick.*?(?=z)
It's important to notice the .*? lazy matching before the (?=z) lookahead: the expression matches a substring up to the first occurrence of the letter z.
Here is a C# code sample:
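The C# listing is not reproduced here. As a rough stand-in, the same idea can be tried in Python. The pattern below is an assumed reading that combines the question's .*?quick with the lazy .*? and (?=z) lookahead described in this answer; the second haystack, with two z letters, is invented to show why the lazy quantifier matters.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

# (?=z) checks that "z" comes next without consuming it;
# the lazy .*? stops at the first position where that check succeeds.
match = re.search(r".*?quick.*?(?=z)", haystack)
found = match.group(0) if match else None

# With two "z" letters, lazy and greedy matching give different results.
two_z = "The quick red fox jumped over the lazy, dozing dog"
lazy_hit = re.search(r".*?quick.*?(?=z)", two_z).group(0)
greedy_hit = re.search(r".*?quick.*(?=z)", two_z).group(0)

print(repr(found))
print(repr(lazy_hit))
print(repr(greedy_hit))
```

The greedy variant runs on to the last z in the text, while the lazy variant stops before the first one.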
Try this