Java RegEx lookahead failing
In Java, I was unable to get a regex to behave the way I wanted, and wrote this little JUnit test to demonstrate the problem:
public void testLookahead() throws Exception {
    Pattern p = Pattern.compile("ABC(?!!)");
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find());

    p = Pattern.compile("[A-Z]{3}(?!!)");
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find());

    p = Pattern.compile("[A-Z]{3}(?!!)", Pattern.CASE_INSENSITIVE);
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find()); //fails, why?

    p = Pattern.compile("[A-Za-z]{3}(?!!)");
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find()); //fails, why?
}
Every assertion passes except for the two marked with the comment. The groups of assertions are identical except for the pattern string. Why would adding case-insensitivity break the matcher?
Comments (2)
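Your tests fail because in both cases the pattern, [A-Z]{3}(?!!) with CASE_INSENSITIVE and [A-Za-z]{3}(?!!), finds at least one match in "blah/ABC!/blah": each finds bla twice, once at the start of each blah. The lookahead does reject ABC!, but find() succeeds as soon as any substring of the input matches. Printing every match that find() returns makes this visible.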
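The answer's original snippet did not survive on this page, so the following is only a minimal sketch of that kind of check, not the answerer's code. The class name LookaheadMatches and the helper allMatches are made up for illustration; the helper records each match together with its start index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LookaheadMatches {

    // Collects every match as "text@startIndex" by calling find() repeatedly.
    static List<String> allMatches(Pattern p, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            found.add(m.group() + "@" + m.start());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "blah/ABC!/blah";
        Pattern insensitive = Pattern.compile("[A-Z]{3}(?!!)", Pattern.CASE_INSENSITIVE);
        Pattern bothCases = Pattern.compile("[A-Za-z]{3}(?!!)");

        System.out.println(allMatches(insensitive, input)); // [bla@0, bla@10]
        System.out.println(allMatches(bothCases, input));   // [bla@0, bla@10]
    }
}
```

Neither list contains ABC: the lookahead rejects it at index 5 because the next character is !, and the engine simply moves on until it reaches the second blah.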
Those two assertions don't get false from find() because the full string contains substrings that match the pattern. Specifically, the bla in blah matches the regular expression (three letters not followed by an exclamation mark). The case-sensitive patterns correctly find nothing because blah isn't upper-case.
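To see the contrast this answer describes, here is a small sketch that asks each of the three character-class patterns from the question for its first match in the same input. The class name CaseSensitivityCheck and the helper firstMatch are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class CaseSensitivityCheck {

    // Returns the first substring find() locates, or null when there is none.
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "blah/ABC!/blah";

        // Upper-case only: ABC is the sole candidate and the lookahead rejects it.
        System.out.println(firstMatch("[A-Z]{3}(?!!)", 0, input)); // null

        // Lower-case letters now qualify, so bla at index 0 matches first.
        System.out.println(firstMatch("[A-Z]{3}(?!!)", Pattern.CASE_INSENSITIVE, input)); // bla
        System.out.println(firstMatch("[A-Za-z]{3}(?!!)", 0, input)); // bla
    }
}
```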