Overlapping group capturing
Please take a look at the following code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {
    String s = "a < b > c > d";
    String regex = "(\\w\\s*[<>]\\s*\\w)";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(s);
    // Print the whole match for each successful find().
    while (m.find()) System.out.println(m.group());
}
The output of the above program is: a < b, c > d
But I actually expect a < b, b > c, c > d.
Anything wrong with my regexp here?
Comments (3)
You're right that "b > c" matches the regex; it does.
But when you call Matcher::find(), it returns the next substring of the input that matches the regex and is disjoint from the previous find() matches. Since "b > c" begins with the 'b' that was already part of the "a < b" match returned by the previous invocation, find() won't return it.
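To make the disjoint-match behaviour concrete, here is a minimal sketch that compares plain find() with find(int start), which resets the matcher and searches from a given index. The class and method names are made up for illustration, and restarting one character after the previous match's start is just one possible strategy that happens to suit the single-character operands in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original post.
class OverlappingFindDemo {
    // Same pattern as in the question (the outer capturing group is not needed here).
    private static final Pattern PAIR = Pattern.compile("\\w\\s*[<>]\\s*\\w");

    // Plain find(): each search resumes where the previous match ended,
    // so consecutive matches never share characters.
    static List<String> disjointMatches(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // find(int): restart the search one character after the previous match began,
    // so a later match may reuse characters consumed by an earlier one.
    static List<String> overlappingMatches(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        int from = 0;
        while (from <= input.length() && m.find(from)) {
            out.add(m.group());
            from = m.start() + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "a < b > c > d";
        System.out.println(disjointMatches(s));    // [a < b, c > d]
        System.out.println(overlappingMatches(s)); // [a < b, b > c, c > d]
    }
}
```

With operands longer than one character this restart strategy would also report partial words, so it is only a demonstration of why find() skips "b > c", not a general solution.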
Try this.
Updated (based on green's solution):
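A minimal sketch of one common way to get overlapping matches, which is not necessarily the code this comment posted: wrap the whole expression in a zero-width lookahead and read the text from capturing group 1. Because the match itself consumes no characters, find() advances one position at a time and can report "b > c" even though 'b' also appears in "a < b". The class and method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original post.
class LookaheadDemo {
    // The question's pattern wrapped in a lookahead: (?=( ... )).
    // The overall match is empty; the interesting text is in group 1.
    private static final Pattern OVERLAPPING =
            Pattern.compile("(?=(\\w\\s*[<>]\\s*\\w))");

    static List<String> overlappingMatches(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = OVERLAPPING.matcher(input);
        while (m.find()) {
            out.add(m.group(1)); // group(0) would be the empty string
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(overlappingMatches("a < b > c > d"));
        // [a < b, b > c, c > d]
    }
}
```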
Building on John's solution and adding some boundary matchers, this finally works.
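Which boundary matchers were added is not shown here, so the following is only a sketch under the assumption that a word boundary (\b) was meant and that operands may be longer than one character (\w+). Without the boundary, a lookahead-based pattern would also match at every position inside a word and report fragments such as "oo < bar". The class, method, and constant names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original post.
class BoundaryLookaheadDemo {
    // Lookahead with multi-character operands but no boundary (for comparison).
    static final Pattern NO_BOUNDARY =
            Pattern.compile("(?=(\\w+\\s*[<>]\\s*\\w+))");

    // Assumed fix: \b forces the left operand to begin at the start of a word.
    static final Pattern WITH_BOUNDARY =
            Pattern.compile("(?=(\\b\\w+\\s*[<>]\\s*\\w+))");

    // Collects group 1 of every (zero-width) match of the given pattern.
    static List<String> matches(Pattern pattern, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "foo < bar > baz";
        System.out.println(matches(NO_BOUNDARY, s));
        // [foo < bar, oo < bar, o < bar, bar > baz, ar > baz, r > baz]
        System.out.println(matches(WITH_BOUNDARY, s));
        // [foo < bar, bar > baz]
    }
}
```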