
regex regex-negation

How can I use a regular expression to find everything except certain phrases?

Posted 2024-10-01 08:35:57, 771 characters, 3 views, 0 comments

Okay, I have the phrase "foo bar", and I want to find everything except "foo bar".
Here is my text.

ipsum dolor foo bar Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do
eiusmod tempor foo bar incididunt ut labore et
dolore foo bar

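There is a way to do this with a regular expression, right? I don't have to fall back on string functions, do I?

Result:

Note that I couldn't do proper highlighting, but the bold gives you the idea (the spaces before and after would also be selected, but they break the bold).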

ipsum dolor foo bar Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do
eiusmod tempor foo bar incididunt ut labore et
dolore foo bar

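Assume PCRE nomenclature.

Update 7/29/2013: It is probably best to use your language's search-and-replace function to "remove" the phrases you do not want, which leaves behind the information you do want.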

江心雾 2024-10-08 08:35:57

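In general, if foobar matches itself, then (?s:(?!foobar).)* matches anything that is not foobar, including nothing at all.

You can use this to find lines that have no foobar in them, for example with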

^(?:(?!foobar).)*$

You can also use your language's split() function to split on foobar, which gives you all the pieces that do not include the split pattern.
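As a rough illustration of both ideas, here is a minimal Python sketch. It assumes Python's re module (which is not PCRE, but accepts the lookahead and scoped-flag syntax used here) and substitutes the question's phrase foo bar for foobar. The names is_free_of_phrase and pieces_between are made up for this example.

```python
import re

PHRASE = "foo bar"  # assumption: the phrase from the question

# Tempered dot: each character is consumed only at positions where PHRASE
# does not start. The scoped (?s: ...) flag lets the dot cross newlines.
NO_PHRASE = re.compile(r"(?s:(?!" + re.escape(PHRASE) + r").)*")

def is_free_of_phrase(text):
    """True if the whole text (possibly empty) contains no PHRASE."""
    return NO_PHRASE.fullmatch(text) is not None

def pieces_between(text):
    """Everything except PHRASE, obtained by splitting on it."""
    return text.split(PHRASE)

if __name__ == "__main__":
    print(is_free_of_phrase("Lorem ipsum\ndolor"))
    print(pieces_between("ipsum dolor foo bar Lorem ipsum"))
```

The tempered pattern is anchored with fullmatch here on purpose: when it is used for unanchored scanning, a match can start one character into the phrase, so split() tends to be the simpler way to collect the pieces.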

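As for the pesky, little-known backtracking control verbs such as (*FAIL) and (*COMMIT), I haven't yet had much opportunity to use them in "non-toy" programs. I find that independent subexpressions via (?>...) and the possessive quantifiers *+, ++, ?+, and so on give me enough rope, so to speak.

That said, I do have one use of them in this answer; it is the first regex solution there. The reason it exists is that I wanted to force the regex engine to backtrack through all possible permutations; the real goal was merely to count how many ways it tries.

Please understand that my two regexes, along with many of the very creative answers offered by others, are meant in fun and are half tongue-in-cheek. Still, once people recover from the shock, there is a lot to learn from them. ☺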


下壹個目標 2024-10-08 08:35:57

Try

^(?!.*foo bar).*$

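This should select every line that does not contain "foo bar" ((?! opens a negative lookahead). It relies on ^ and $ matching at line boundaries, so multiline matching has to be enabled in tools where that is not the default.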
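A minimal Python sketch of the same pattern, assuming the re module with re.MULTILINE so that ^ and $ anchor at every line. The function name lines_without_phrase is made up for this example.

```python
import re

# The pattern from the answer, compiled so that ^ and $ match per line.
LINE_WITHOUT_PHRASE = re.compile(r"^(?!.*foo bar).*$", re.MULTILINE)

def lines_without_phrase(text):
    """Return every line of text that does not contain 'foo bar'."""
    return LINE_WITHOUT_PHRASE.findall(text)

if __name__ == "__main__":
    sample = (
        "ipsum dolor foo bar Lorem ipsum dolor sit amet,\n"
        "consectetur adipisicing elit, sed do\n"
        "eiusmod tempor foo bar incididunt ut labore et\n"
        "dolore foo bar"
    )
    print(lines_without_phrase(sample))
```

Note that this selects whole lines only; the text around foo bar on the other lines is not returned.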


花间憩 2024-10-08 08:35:57

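"Remove everything except foo bar" is equivalent to "find only foo bar", which PCRE allows very easily. Conversely, "find everything except foo bar" is equivalent to "find and remove only foo bar". So the complement can easily be produced by your tool.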
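The equivalence can be sketched in a few lines of Python, assuming the re module stands in for "your tool". The function names only_phrase and everything_but_phrase are made up for this example.

```python
import re

PHRASE = re.compile(r"foo bar")

def only_phrase(text):
    """'Remove everything except foo bar': just collect the matches."""
    return PHRASE.findall(text)

def everything_but_phrase(text):
    """'Find everything except foo bar': delete the matches, keep the rest."""
    return PHRASE.sub("", text)

if __name__ == "__main__":
    line = "eiusmod tempor foo bar incididunt ut labore et"
    print(only_phrase(line))
    print(everything_but_phrase(line))
```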

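Beyond that, PCRE has a nasty little feature called (*FAIL), which forces backtracking as soon as it is encountered. So I suppose inserting something like (*COMMIT)foo bar(*FAIL) into the regex might help. But it is neither friendly nor very safe.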


明月松间行 2024-10-08 08:35:57

Okay, so you want to remove everything except foo bar using UltraEdit's "Advanced" (Perl-regex style) search feature. The easiest way to do that is to match everything, but only capture foo bar, like this:

(?:(?!foo bar).)+(foo bar|$)

...and replace it with $1 or \1 (whichever style UltraEdit accepts).

I don't use UltraEdit, but in EditPadPro it converts this:

ipsum dolor foo bar Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do
eiusmod tempor foo bar incididunt ut labore et
dolore foo bar

...to this:

foo bar

foo bar
foo bar

...which is the result you showed in your original message.
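For readers without either editor, the same replacement can be tried with a small Python sketch. It assumes Python's re module with re.MULTILINE (so that $ matches at each line end, as it appears to in the editor run above). The function name keep_only_phrase is made up for this example.

```python
import re

# Pattern from the answer: consume text that is not "foo bar", then capture
# either the phrase itself or the (empty) end of the line.
KEEP_PHRASE = re.compile(r"(?:(?!foo bar).)+(foo bar|$)", re.MULTILINE)

def keep_only_phrase(text):
    """Replace each match with its capture group, leaving only 'foo bar'."""
    return KEEP_PHRASE.sub(r"\1", text)

if __name__ == "__main__":
    sample = (
        "ipsum dolor foo bar Lorem ipsum dolor sit amet,\n"
        "consectetur adipisicing elit, sed do\n"
        "eiusmod tempor foo bar incididunt ut labore et\n"
        "dolore foo bar"
    )
    print(keep_only_phrase(sample))
```

One caveat worth checking in your own tool: because of the +, a match cannot begin exactly at foo bar, so in Python a line that starts with the phrase is mangled ("foo bar x" becomes "f"). Swapping + for * seems to avoid this in the sketch.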


软糯酥胸 2024-10-08 08:35:57

Here: perl -pe 's{.*?(foo bar)?}{$1}g' <text

I want to find everything BUT "foo bar"

A match-only pattern without using substitution by $1 (that is usable with the empty replacement as in s{pattern}{})... not sure that is possible. You would have to gobble up chars up until foo bar, e.g. with .*?(?=foo bar). But then the matching algorithm continues on and sees "oo bar", and would match again as there is no f.

Continuing the quest, here is a piece of Perl code that gobbles up the requested parts, with the one drawback that empty captures may be returned if foo bar happens to be at the start of the line:

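use strict;
use warnings;

while (my $line = <>) {
    chomp $line;
    # Capture each run of text that precedes "foo bar" or the end of the line.
    my @parts = $line =~ m{(.*?)(?:foo bar|$)}gs;
    print "[[ $_ ]]\n" for @parts;
}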

There is no substitution involved, and running this on the Lorem ipsum test file will show everything but the foo bar parts. It is PCRE compatible, but there are no guarantees that $EDITOR will do what you envision.
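For comparison, a rough Python port of the same capture-based approach, assuming the re module with re.DOTALL standing in for Perl's /s modifier. The function name pieces_outside_phrase is made up for this example, and filtering out the empty captures is an addition of this sketch, not part of the Perl code above.

```python
import re

# Lazily capture text up to the next "foo bar" or the end of the line.
PIECE = re.compile(r"(.*?)(?:foo bar|$)", re.DOTALL)

def pieces_outside_phrase(line):
    """Return the non-empty runs of line that lie outside 'foo bar'."""
    # findall yields the captured group of every match, including the
    # empty captures mentioned above, which are dropped here.
    return [piece for piece in PIECE.findall(line) if piece]

if __name__ == "__main__":
    for line in ["eiusmod tempor foo bar incididunt ut labore et", "dolore foo bar"]:
        for piece in pieces_outside_phrase(line):
            print(f"[[ {piece} ]]")
```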
