Regular expression that doesn't contain a certain string
I have something like this:
aabbabcaabda
For selecting the minimal group wrapped by a, I have /a([^a]*)a/, which works just fine.
But I have a problem with groups wrapped by aa, where I'd need something like /aa([^aa]*)aa/, which doesn't work, and I can't use the first one, like /aa([^a]*)aa/, because it would end on the first occurrence of a, which I don't want.
Generally, is there any way to say "does not contain string" in the same way that I can say "does not contain character" with [^a]?
Simply said, I need aa followed by any characters except the sequence aa, and then ending with aa.
I found a blog post from 2007 which gives the following regex that matches strings which don't contain a certain substring:
It works as follows: it looks for zero or more (*) characters (.) which do not begin (?!, a negative lookahead) your string, and it stipulates that the entire string must be made up of such characters (by using the ^ and $ anchors). Or, to put it another way:
The entire string must be made up of characters which do not begin the given string, which means that the string doesn't contain the given substring.
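A minimal Python sketch of the pattern this description implies (reconstructed from the explanation, so not necessarily the blog post's exact regex). The helper name `lacks_substring` is made up for illustration, and aa from the question is used as the forbidden string.

```python
import re


def lacks_substring(text: str, forbidden: str) -> bool:
    """Return True if `text` does not contain `forbidden` (hypothetical helper)."""
    # At every position, (?!...) asserts that `forbidden` does not start here,
    # then . consumes one character; ^ and $ force this to cover the whole string.
    # Note: . does not match a newline unless re.DOTALL is passed.
    pattern = re.compile(r"^(?:(?!" + re.escape(forbidden) + r").)*$")
    return pattern.match(text) is not None


print(lacks_substring("abab", "aa"))   # True
print(lacks_substring("abaab", "aa"))  # False
```

The lookahead is inside the repeated group on purpose: placing it once at the start would only rule out strings that begin with the forbidden text.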
In general, it's a pain to write a regular expression that matches only strings not containing a particular string. We had to do this for models of computation: you take an NFA, which is easy enough to define, and then reduce it to a regular expression. The expression for things not containing "cat" was about 80 characters long.
Edit: I just finished and yes, it's:
Here is a very brief tutorial. I found some great ones before, but I can't see them anymore.
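To give a feel for the lookaround-free approach, here is a small Python sketch for the two-character string aa from the question. It is not the "cat" expression mentioned above, and the names `NO_DOUBLE_A` and `has_no_double_a` are made up for illustration.

```python
import re
from itertools import product

# Every step is either a non-a, or an a immediately followed by a non-a;
# a single trailing a is allowed at the very end.
NO_DOUBLE_A = re.compile(r"(?:[^a]|a[^a])*a?")


def has_no_double_a(text: str) -> bool:
    """Return True if `text` does not contain "aa" (hypothetical helper)."""
    return NO_DOUBLE_A.fullmatch(text) is not None


# Brute-force check against the plain substring test on all short strings over {a, b}.
for length in range(7):
    for chars in product("ab", repeat=length):
        s = "".join(chars)
        assert has_no_double_a(s) == ("aa" not in s)
```

Even for two characters the pattern has to spell out every allowed state by hand, which is why this style grows quickly for longer strings.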
All you need is a reluctant quantifier:
You could use negative lookahead, too, but in this case it's just a more verbose way to accomplish the same thing. Also, it's a little trickier than gpojd made it out to be. The lookahead has to be applied at each position before the dot is allowed to consume the next character.
As for the approach suggested by Claudiu and finnw, it'll work okay when the sentinel string is only two characters long, but (as Claudiu acknowledged) it's too unwieldy for longer strings.
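A short Python sketch of both variants on the question's sample string. The exact patterns are assumptions written from the description above (a reluctant `.*?` between the two aa markers, and a lookahead checked before every character).

```python
import re

text = "aabbabcaabda"

# Reluctant (lazy) quantifier: .*? stops at the first aa that can close the group.
lazy = re.findall(r"aa(.*?)aa", text)

# Negative-lookahead version: (?!aa) is checked before each character is consumed.
guarded = re.findall(r"aa((?:(?!aa).)*)aa", text)

print(lazy)     # ['bbabc']
print(guarded)  # ['bbabc']
```

Both patterns capture the same group here; the lookahead form is simply longer to write.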
I'm not sure it's a standard construct, but I think you should have a look at "negative lookahead" (written "?!", without the quotes).
It's far easier than all the answers in this thread, including the accepted one.
Example:
Regex: "^(?!123)[0-9]*\w"
Matches any string beginning with digits followed by a word character, unless the string starts with 123.
https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference
(a Microsoft page, but quite comprehensive) for lookahead / lookbehind
PS: it works well for me (.NET). But if I'm wrong about something, please let us know. I find this construct very simple and effective, so I'm surprised by the accepted answer.
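A quick way to see what that pattern accepts, sketched in Python instead of .NET (the lookahead syntax is the same in both). The sample inputs and the helper name `leading_match` are made up for illustration.

```python
import re

pattern = re.compile(r"^(?!123)[0-9]*\w")


def leading_match(candidate: str):
    """Return the matched prefix, or None if the pattern rejects the input."""
    m = pattern.match(candidate)
    return m.group(0) if m else None


for candidate in ["456abc", "12a", "123abc", "1234a"]:
    print(candidate, "->", leading_match(candidate))
# 456abc -> 456a
# 12a -> 12a
# 123abc -> None
# 1234a -> None
```

Because the lookahead sits right after ^, it only rejects input that starts with 123; it does not rule out 123 appearing later in the string. Also note that [0-9]* may match zero digits and \w matches digits too, so inputs such as "abc" or "45" match as well.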
I had the following code, and I had to replace all references to JS files, adding a GET parameter to each, except one.
This is the regex I used:
What that does is look for all occurrences of ".js" and, if they are preceded by the "EXCEPTION" string, discard that result from the result array. That's called negative lookbehind. Since I spent a day finding out how to do this, I thought I should share.
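A minimal Python sketch of a negative lookbehind that behaves as described. The pattern is an assumption based on the description, and the sample markup and the "?v=2" parameter are made up for illustration.

```python
import re

html = '<script src="app.js"></script><script src="EXCEPTION.js"></script>'

# (?<!EXCEPTION) is a negative lookbehind: ".js" matches only when the text
# immediately before it is not "EXCEPTION".
pattern = re.compile(r"(?<!EXCEPTION)\.js")

# "?v=2" is a hypothetical GET parameter appended to every matching reference.
result = pattern.sub(".js?v=2", html)
print(result)
# <script src="app.js?v=2"></script><script src="EXCEPTION.js"></script>
```

Python's `re` module requires the lookbehind to have a fixed width, which a literal string like this satisfies.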
In Java this will find all files ending in ".ftl" but not ending in ".inc.ftl", which is exactly what I wanted.
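The Java pattern itself is not shown, so the following is only a Python sketch of one pattern with the same behavior, using a negative lookbehind before the extension. The file names and the constant name `NOT_INC_FTL` are made up for illustration.

```python
import re

# Match ".ftl" at the end of the name, but only when it is not preceded by ".inc".
NOT_INC_FTL = re.compile(r"(?<!\.inc)\.ftl$")

files = ["page.ftl", "header.inc.ftl", "a.ftl", "notes.txt"]
selected = [name for name in files if NOT_INC_FTL.search(name)]
print(selected)  # ['page.ftl', 'a.ftl']
```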