Regex: match up to but not including

Posted 2024-09-25 21:23:36

What is the regex syntax for "search until, but not including"? Something like:

Haystack:
The quick red fox jumped over the lazy brown dog

Expression:
.*?quick -> and then everything until it hits the letter "z" but do not include z

Comments (3)

一口甜 2024-10-02 21:23:36

The explicit way of saying "search until X but not including X" is:

(?:(?!X).)*

where X can be any regular expression.

In your case, though, this might be overkill - here the easiest way would be

[^z]*

This will match anything except z and therefore stop right before the next z.

So .*?quick[^z]* will match The quick red fox jumped over the la.
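As a quick check of the negated character class, here is a minimal Python sketch using the standard re module and the haystack from the question. The variable names are only illustrative.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

# [^z]* consumes every character that is not "z", so the match ends
# immediately before the first "z" that follows "quick".
match = re.search(r".*?quick[^z]*", haystack)
result = match.group(0) if match else None
print(repr(result))
```

Note that [^z] also matches line breaks, so this form does not depend on any "dot matches all" setting.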

However, as soon as you have more than one simple letter to look out for, (?:(?!X).)* comes into play, for example

(?:(?!lazy).)* - match anything until the start of the word lazy.

This is using a lookahead assertion, more specifically a negative lookahead.

.*?quick(?:(?!lazy).)* will match The quick red fox jumped over the (including the space right before lazy).

Explanation:

(?:        # Match the following but do not capture it:
 (?!lazy)  # first assert that it's not possible to match "lazy" here,
 .         # then match any character
)*         # end of group, zero or more repetitions.
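The commented pattern above can be tried almost verbatim in Python, whose re.VERBOSE flag allows whitespace and comments inside a regex. This is only a sketch on the question's haystack; the stop word "cat" in the second pattern is a made-up example of a stop word that never occurs.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

until_lazy = re.compile(r"""
    .*?quick      # lazily skip ahead to the first "quick"
    (?:           # non-capturing group:
      (?!lazy)    #   "lazy" must not start at this position
      .           #   consume one character
    )*            # repeat zero or more times
""", re.VERBOSE)

matched = until_lazy.search(haystack).group(0)
print(repr(matched))

# If the stop word never occurs, the group simply runs to the end of the line.
no_stop = re.search(r".*?quick(?:(?!cat).)*", haystack).group(0)
print(repr(no_stop))
```

The second search shows a property worth keeping in mind: this construct does not require the stop word to be present at all.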

Furthermore, when searching for keywords, you might want to surround them with word boundary anchors: \bfox\b will only match the complete word fox but not the fox in foxy.
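To illustrate the word boundary anchors, here is a small Python sketch. The sample sentence is invented for this example and is not from the question.

```python
import re

text = "A foxy fox outfoxed the fox"

# Without anchors, "fox" is found inside "foxy" and "outfoxed" as well.
any_occurrence = re.findall(r"fox", text)

# With \b on both sides, only the standalone word "fox" matches.
whole_word_starts = [m.start() for m in re.finditer(r"\bfox\b", text)]

print(len(any_occurrence), whole_word_starts)
```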

Note

If the text to be matched can also include line breaks, you will need to set the "dot matches all" option of your regex engine. Usually you can do that by prepending (?s) to the regex, but not every engine supports this inline modifier (notably JavaScript, where the s flag, as in /.../s, does the same job in ES2018 and later).
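The effect of the "dot matches all" option can be seen in a short Python sketch. In Python the option is called re.DOTALL, and the inline (?s) form is also accepted at the start of a pattern. The two-line haystack is an assumed variation of the question's text.

```python
import re

multiline = "The quick red fox\njumped over the lazy brown dog"
pattern = r".*?quick(?:(?!lazy).)*"

# By default "." does not match "\n", so the match stops at the line break.
without_dotall = re.search(pattern, multiline).group(0)

# With DOTALL (or the inline (?s) flag) the match continues up to "lazy".
with_dotall = re.search(pattern, multiline, re.DOTALL).group(0)
inline_flag = re.search(r"(?s)" + pattern, multiline).group(0)

print(repr(without_dotall))
print(repr(with_dotall))
```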

Alternative solution:

In many cases, you can also use a simpler, more readable solution that uses a lazy quantifier. By adding a ? to the * quantifier, it will try to match as few characters as possible from the current position:

.*?(?=(?:X)|$)

will match any number of characters, stopping right before X (which can be any regex) or the end of the string (if X doesn't match). You may also need to set the "dot matches all" option for this to work. (Note: I added a non-capturing group around X in order to reliably isolate it from the alternation)
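A minimal Python sketch of the lazy-quantifier variant, again on the question's haystack. The stop word "cat" is a made-up example of an X that never matches, to show what the |$ alternative is for.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

# Stops right before the first "lazy".
before_lazy = re.match(r".*?(?=(?:lazy)|$)", haystack).group(0)

# "cat" never occurs, so the "$" branch lets the match run to the end.
to_the_end = re.match(r".*?(?=(?:cat)|$)", haystack).group(0)

# Without the "|$" fallback the same search finds nothing at all.
no_fallback = re.match(r".*?(?=cat)", haystack)

print(repr(before_lazy))
print(repr(to_the_end))
print(no_fallback)
```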

层林尽染 2024-10-02 21:23:36

A lookahead can help you achieve your goal. A regex for your example is

.*?quick.*?(?=z)

Note the lazy .*? before the (?=z) lookahead: it makes the expression match only up to the first occurrence of the letter z.

Here is a C# code sample:

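using System;
using System.Text.RegularExpressions;

// Same haystack as in the question, with an extra "z" appended at the end.
const string text = "The quick red fox jumped over the lazy brown dogz";

// Lazy .*? stops before the first z.
string lazy = new Regex(".*?quick.*?(?=z)").Match(text).Value;
Console.WriteLine(lazy); // The quick red fox jumped over the la

// Greedy .* runs on to the last z.
string greedy = new Regex(".*?quick.*(?=z)").Match(text).Value;
Console.WriteLine(greedy); // The quick red fox jumped over the lazy brown dog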
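
For readers who do not use C#, the same lazy versus greedy comparison can be sketched in Python. This is a port written for illustration, not part of the original answer, and it uses the same haystack with the extra trailing z.

```python
import re

text = "The quick red fox jumped over the lazy brown dogz"

# Lazy: .*? expands one character at a time until (?=z) first succeeds.
lazy = re.search(r".*?quick.*?(?=z)", text).group(0)

# Greedy: .* grabs everything, then backtracks to the last position before a z.
greedy = re.search(r".*?quick.*(?=z)", text).group(0)

print(repr(lazy))
print(repr(greedy))
```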

不语却知心 2024-10-02 21:23:36

Try this (the text before z ends up in capture group 1):

(.*?quick.*?)z
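
A short Python sketch of how this capture-group variant behaves on the question's haystack. Here the overall match does include the z, and only group 1 leaves it out. The shortened second string is an invented input without any z.

```python
import re

haystack = "The quick red fox jumped over the lazy brown dog"

m = re.search(r"(.*?quick.*?)z", haystack)
whole_match = m.group(0)   # includes the terminating "z"
captured = m.group(1)      # everything before that "z"

# Unlike [^z]*, this pattern needs a "z" to exist, otherwise nothing matches.
no_z = re.search(r"(.*?quick.*?)z", "The quick red fox")

print(repr(whole_match))
print(repr(captured))
print(no_z)
```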