Overlapping group capturing

Posted on 2024-10-28 02:28:08

Please take a look at the following code:

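import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String s = "a < b > c > d";
        String regex = "(\\w\\s*[<>]\\s*\\w)";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        // Print capturing group 1 (the whole pair) for each match
        while (m.find()) System.out.println(m.group(1));
    }
}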

The output of the above program is: a < b, c > d

But I actually expect a < b, b > c, c > d.

Is there anything wrong with my regex?


Comments (3)

江心雾 2024-11-04 02:28:08

You're right that "b > c" matches the regex.

However, each call to Matcher::find() returns the next substring of the input that matches the regex and is disjoint from the previous find() matches: the search resumes at the first character after the previous match. Since "b > c" begins with the 'b' that was already consumed as part of the "a < b" match returned by the previous invocation, find() skips it.
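To make the resumption point visible, here is a minimal sketch that records the start and end offsets of each match. The class name FindOffsets and the helper spans are illustrative names that do not appear in the question; the pattern is the question's regex without the outer capturing group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindOffsets {
    // Returns one "start-end:text" entry per match found by successive find() calls.
    public static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [0-5:a < b, 8-13:c > d]
        System.out.println(spans("\\w\\s*[<>]\\s*\\w", "a < b > c > d"));
    }
}
```

The first match ends at offset 5, so the second search starts after the 'b' at offset 4, and "b > c" can no longer be found.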

甜心 2024-11-04 02:28:08

Try this.

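    String s = "a < b > c > d";
    // The lookahead captures a pair in group 1 without consuming it; the trailing "."
    // then consumes a single character, so the next find() can start inside that pair.
    String regex = "(?=(\\w\\s[<>]\\s\\w)).";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println(m.group(1));
    }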
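If you would rather keep the question's regex unchanged, one possible alternative (a sketch, not something proposed in this thread) is to restart the search manually with Matcher.find(int), beginning one character after the start of the previous match. The class name OverlappingFind and the helper overlapping are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlappingFind {
    // Collects matches that may overlap by restarting each search
    // one character after the start of the previous match.
    public static List<String> overlapping(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int from = 0;
        // find(int) throws if the index exceeds the input length, hence the guard
        while (from <= input.length() && m.find(from)) {
            out.add(m.group());
            from = m.start() + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [a < b, b > c, c > d]
        System.out.println(overlapping("\\w\\s*[<>]\\s*\\w", "a < b > c > d"));
    }
}
```

This works here because each operand is a single character. With multi-character operands (for example \w+ on "ab < cd") the restart would also report suffixes such as "b < cd", so a boundary check is needed in that case.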

Update (based on green's solution):

    String s = " something.js > /some/path/to/x19-v1.0.js < y < z < a > b > c > d";
    String regex = "(?=[\\s,;]+|(?<![\\w\\/\\-\\.])([\\w\\/\\-\\.]+\\s*[<>]\\s*[\\w\\/\\-\\.]+))";

    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(s);
    while (m.find()) {
        // group(1) is null when the separator alternative ([\s,;]+) matched instead of a pair
        String d = m.group(1);
        if (d != null) {
            System.out.println(d);
        }
    }
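As a runnable check of the updated pattern, the sketch below collects the non-null captures into a list. It also compares them with a shorter variant that drops the separator branch, so that group 1 is set on every match. That variant, the class name PairCapture, and the helper pairs are my own assumptions and are not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairCapture {
    // The regex from the update above, unchanged.
    static final String WITH_SEPARATOR_BRANCH =
        "(?=[\\s,;]+|(?<![\\w\\/\\-\\.])([\\w\\/\\-\\.]+\\s*[<>]\\s*[\\w\\/\\-\\.]+))";

    // Assumed simplification: match only at the start of an operand, so group 1 is never null.
    static final String TOKEN_START_ONLY =
        "(?<![\\w/.-])(?=([\\w/.-]+\\s*[<>]\\s*[\\w/.-]+))";

    // Collects group 1 of every match, skipping matches in which it did not participate.
    public static List<String> pairs(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            String pair = m.group(1);
            if (pair != null) {
                out.add(pair);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String s = " something.js > /some/path/to/x19-v1.0.js < y < z < a > b > c > d";
        // Prints seven pairs, from "something.js > /some/path/to/x19-v1.0.js" to "c > d"
        pairs(WITH_SEPARATOR_BRANCH, s).forEach(System.out::println);
        // prints true
        System.out.println(pairs(WITH_SEPARATOR_BRANCH, s).equals(pairs(TOKEN_START_ONLY, s)));
    }
}
```

Both patterns are zero-width, so find() advances by one character after each match; the negative lookbehind is what prevents a match from starting in the middle of an operand.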

魂ガ小子 2024-11-04 02:28:08

Building on John's solution and adding some boundary matchers, this finally works.

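    String s = " something.js > /some/path/to/x19-v1.0.js < y < z < a > b > c > d";
    // Each pair must be preceded by a separator ([\s,;]+), hence the leading space in s.
    // Inside a character class, $ is a literal dollar sign, not an end-of-input anchor.
    String regex = "(?=[\\s,;]+([\\w\\/\\-\\.]+\\s*[<>]\\s*[\\w\\/\\-\\.]+)[\\s,;$]*).";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println(m.group(1));
    }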
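One caveat worth checking: because this pattern requires a separator before every pair, the first pair is missed when the input does not start with whitespace, a comma, or a semicolon. The sketch below demonstrates that and tries a variant that also accepts the start of the input. The variant, the class name SeparatorCheck, and the helper pairs are assumptions of mine, not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatorCheck {
    // The regex from the answer above, unchanged.
    static final String ANSWER =
        "(?=[\\s,;]+([\\w\\/\\-\\.]+\\s*[<>]\\s*[\\w\\/\\-\\.]+)[\\s,;$]*).";

    // Assumed variant: a pair may also begin at the very start of the input.
    static final String ANCHORED =
        "(?=(?:^|[\\s,;]+)([\\w\\/\\-\\.]+\\s*[<>]\\s*[\\w\\/\\-\\.]+)).";

    // Collects group 1 of every match.
    public static List<String> pairs(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(pairs(ANSWER, " a < b > c"));   // prints [a < b, b > c]
        System.out.println(pairs(ANSWER, "a < b > c"));    // prints [b > c]
        System.out.println(pairs(ANCHORED, "a < b > c"));  // prints [a < b, b > c]
        // $ inside a character class matches a literal dollar sign
        System.out.println("$".matches("[\\s,;$]"));       // prints true
    }
}
```

If the input is always normalized to begin with a separator, as in the answer's sample string, the original pattern is sufficient.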