Regex NOT operator

Posted on 2024-12-03 12:17:38

Is there a NOT operator in regexes?
For example, given this string: "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)"

I want to delete every match of \([0-9a-zA-z _\.\-:]*\) except the one that is a year: (2001).

So the result after the replacement should be: (2001) name.

NOTE: something like \((?![\d]){4}[0-9a-zA-z _\.\-:]*\) does not work for me (the (20019) somehow also matches...)

Comments (4)

甜点 2024-12-10 12:17:38

Not quite, although you can usually work around it with one of these forms:

  • [^abc], which matches a single character that is not a, b, or c,
  • or negative lookahead: a(?!b), which is a not followed by b,
  • or negative lookbehind: (?<!a)b, which is b not preceded by a.
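
The three forms above can be tried out with a small Java sketch. The class name, method names, and sample strings are invented for illustration and are not part of the original answer.

```java
public class NegationForms {
    // [^abc]: a single character that is none of a, b, c.
    // Removing every such character keeps only a, b and c.
    static String stripNonAbc(String s) {
        return s.replaceAll("[^abc]", "");
    }

    // a(?!b): an 'a' that is not immediately followed by 'b'.
    static String markANotBeforeB(String s) {
        return s.replaceAll("a(?!b)", "A");
    }

    // (?<!a)b: a 'b' that is not immediately preceded by 'a'.
    static String markBNotAfterA(String s) {
        return s.replaceAll("(?<!a)b", "B");
    }

    public static void main(String[] args) {
        System.out.println(stripNonAbc("a-b-c-d"));     // abc
        System.out.println(markANotBeforeB("ab ac a")); // ab Ac A
        System.out.println(markBNotAfterA("ab cb b"));  // ab cB B
    }
}
```

Note that the lookarounds only test the neighbouring text; the replacement touches the a or b itself, never the character that was inspected.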

那小子欠揍 2024-12-10 12:17:38

No, there's no direct not operator. At least not the way you hope for.

You can use a zero-width negative lookahead, however:

\((?!2001\))[0-9a-zA-z _\.\-:]*\)

The (?!...) part means "only match if the text following (hence: lookahead) this doesn't (hence: negative) match this". But it doesn't actually consume the characters it matches (hence: zero-width).

There are actually 4 combinations of lookarounds (http://www.regular-expressions.info/lookaround.html) along 2 axes:

  • lookbehind / lookahead : specifies if the characters before or after the point are considered
  • positive / negative : specifies if the characters must match or must not match.
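
As a rough Java sketch, the negative lookahead can be applied to the string from the question like this. It assumes the goal is to keep every group that is exactly four digits, so the lookahead includes the closing parenthesis; that is what stops (20019) from being treated as a year. The class and method names are hypothetical, A-Z replaces the question's A-z range on the assumption that only letters were intended, and the whitespace clean-up is an extra step not mentioned in the answer.

```java
public class KeepYearGroups {
    // A parenthesised group, unless it is exactly four digits.
    // (?!\d{4}\)) is tested right after the opening parenthesis and consumes nothing.
    // Assumption: A-Z instead of the question's A-z.
    static final String NON_YEAR_GROUP = "\\((?!\\d{4}\\))[0-9a-zA-Z _.\\-:]*\\)";

    static String removeNonYearGroups(String input) {
        String stripped = input.replaceAll(NON_YEAR_GROUP, "");
        // Collapse the whitespace left behind by the removed groups.
        return stripped.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String subject = "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
        System.out.println(removeNonYearGroups(subject)); // (2001) name
    }
}
```

Without the \) inside the lookahead, any group that merely starts with four digits, such as (20019), would also be kept.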

猫九 2024-12-10 12:17:38

You could capture the (2001) part and replace the rest with nothing.

public static String extractYearString(String input) {
    // Replace the whole input with capture group 1: the four digits inside parentheses.
    return input.replaceAll(".*\\(([0-9]{4})\\).*", "$1");
}

var subject = "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
var result = extractYearString(subject);
System.out.println(result); // <-- "2001"

.*\(([0-9]{4})\).* means

  • .* match anything
  • \( match a ( character
  • ( begin capture
  • [0-9]{4} any single digit four times
  • ) end capture
  • \) match a ) character
  • .* anything (rest of string)

无畏 2024-12-10 12:17:38

Here is an alternative:

(\(\d{4}\))((?:\s*\([0-9a-zA-z _\.\-:]*\))*)([^()]*)(( ?\([0-9a-zA-z _\.\-:]*\))*)

Repeated patterns are wrapped in a single group using this construction, where the inner group is non-capturing: ((?:pattern)*). This gives control over the numbering of the groups of interest.

Then you get what you want with: \1\3
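
In Java the same replacement would be written with $1$3 rather than \1\3. The sketch below is one possible rendering: the class and method names are hypothetical, the pattern is this answer's pattern escaped for a Java string with A-Z in place of A-z (an assumption about the intended range), and the final trim() is an addition to drop the space left before the removed trailing group.

```java
public class GroupedAlternative {
    // Group 1: the year, group 2: groups after the year,
    // group 3: plain text without parentheses, group 4: trailing groups.
    // Assumption: A-Z instead of the answer's A-z.
    static final String PATTERN =
        "(\\(\\d{4}\\))"
        + "((?:\\s*\\([0-9a-zA-Z _.\\-:]*\\))*)"
        + "([^()]*)"
        + "(( ?\\([0-9a-zA-Z _.\\-:]*\\))*)";

    static String keepYearAndName(String input) {
        // Keep only the year (group 1) and the plain text (group 3).
        return input.replaceAll(PATTERN, "$1$3").trim();
    }

    public static void main(String[] args) {
        String subject = "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
        System.out.println(keepYearAndName(subject)); // (2001) name
    }
}
```

This version depends on the year being the first group in the string, which holds for the example but is not guaranteed in general.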
