Is Java RegEx case-insensitive?

Posted on 2024-09-13 13:41:50

In Java, when doing a replaceAll to look for a regex pattern like:

replaceAll("\\?i\\b(\\w+)\\b(\\s+\\1)+\\b", "$1"); 

(to remove duplicate consecutive case-insensitive words, e.g. Test test), I'm not sure where to put the ?i. I read that it is supposed to go at the beginning, but if I take it out then I catch duplicate consecutive words (e.g. test test), but not case-insensitive ones (e.g. Test test). So I thought I could add the ?i at the beginning, but that does not seem to get the job done. Any thoughts? Thanks!

Comments (5)

夏天碎花小短裙 2024-09-20 13:41:50

You can also make the match case-insensitive, and the code more readable, by passing the Pattern.CASE_INSENSITIVE constant:

Pattern mypattern = Pattern.compile(MYREGEX, Pattern.CASE_INSENSITIVE);
Matcher mymatcher = mypattern.matcher(mystring);
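
To make this concrete for the question's use case, here is a minimal sketch that compiles the duplicate-word pattern once with the flag and reuses it. The class name `DuplicateWordsWithFlag` and the method `collapse` are made up for illustration; only `Pattern.compile`, `Pattern.CASE_INSENSITIVE` and `Matcher.replaceAll` come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class DuplicateWordsWithFlag {
    // The duplicate-word pattern without any inline flag;
    // case-insensitivity is requested through the compile() argument instead.
    private static final Pattern REPEATED_WORD =
            Pattern.compile("\\b(\\w+)(\\s+\\1)+\\b", Pattern.CASE_INSENSITIVE);

    // Replaces each run of repeated words with its first occurrence.
    static String collapse(String input) {
        Matcher matcher = REPEATED_WORD.matcher(input);
        return matcher.replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("Test test is a a test")); // Test is a test
    }
}
```

Compiling the pattern once into a constant also avoids recompiling it on every call, which `String.replaceAll` would otherwise do.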

撑一把青伞 2024-09-20 13:41:50

RegexBuddy tells me that if you want to include it at the beginning, this is the correct syntax:

"(?i)\\b(\\w+)\\b(\\s+\\1)+\\b"

平安喜乐 2024-09-20 13:41:50

Yes, case insensitivity can be enabled and disabled at will in Java regex.

It looks like you want something like this:

    System.out.println(
        "Have a meRry MErrY Christmas ho Ho hO"
            .replaceAll("(?i)\\b(\\w+)(\\s+\\1)+\\b", "$1")
    );
    // Have a meRry Christmas ho

Note that the embedded Pattern.CASE_INSENSITIVE flag is (?i) not \?i. Note also that one superfluous \b has been removed from the pattern.
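
The difference between the two spellings can be checked directly. The sketch below (the class and constant names are illustrative only) shows that `\\?i` in a Java string is the regex `\?i`, which matches the two literal characters `?i` rather than setting a flag, so the question's pattern leaves the input untouched:

```java
// Hypothetical demo class, not part of the original answer.
class InlineFlagSyntax {
    // Pattern from the question: \? is an escaped literal question mark.
    static final String ESCAPED = "\\?i\\b(\\w+)\\b(\\s+\\1)+\\b";
    // Corrected pattern: (?i) is the embedded CASE_INSENSITIVE flag.
    static final String INLINE = "(?i)\\b(\\w+)(\\s+\\1)+\\b";

    public static void main(String[] args) {
        System.out.println("?i".matches("\\?i"));                  // true
        System.out.println("Test test".replaceAll(ESCAPED, "$1")); // Test test
        System.out.println("Test test".replaceAll(INLINE, "$1"));  // Test
    }
}
```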

The (?i) is placed at the beginning of the pattern to enable case-insensitivity. In this particular case, it is not overridden later in the pattern, so in effect the whole pattern is case-insensitive.

It is worth noting that you can in fact limit case-insensitivity to only part of the pattern. Thus, the question of where to put it really depends on the specification (although for this particular problem it doesn't matter much, since \w already matches letters of either case; the flag only has to be in effect by the time the backreference \1 is matched).
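
As a rough check of that remark, the sketch below moves `(?i)` around in the duplicate-word pattern. The class and constant names are invented for the demo. Placing the flag after the capturing group but before `\1` gives the same result as placing it at the start, while placing it after the backreference leaves `\1` case-sensitive:

```java
// Hypothetical demo class, not part of the original answer.
class FlagPlacement {
    static final String INPUT = "Have a meRry MErrY Christmas ho Ho hO";

    static final String FLAG_AT_START = "(?i)\\b(\\w+)(\\s+\\1)+\\b";
    static final String FLAG_BEFORE_BACKREF = "\\b(\\w+)(?i)(\\s+\\1)+\\b";
    // Here the flag is switched on only after \1 has already been parsed.
    static final String FLAG_AFTER_BACKREF = "\\b(\\w+)(\\s+\\1)+(?i)\\b";

    public static void main(String[] args) {
        System.out.println(INPUT.replaceAll(FLAG_AT_START, "$1"));       // Have a meRry Christmas ho
        System.out.println(INPUT.replaceAll(FLAG_BEFORE_BACKREF, "$1")); // Have a meRry Christmas ho
        System.out.println(INPUT.replaceAll(FLAG_AFTER_BACKREF, "$1"));  // Have a meRry MErrY Christmas ho Ho hO
    }
}
```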

To demonstrate, here's a similar example of collapsing runs of letters like "AaAaaA" to just "A".

    System.out.println(
        "AaAaaA eeEeeE IiiIi OoooOo uuUuUuu"
            .replaceAll("(?i)\\b([A-Z])\\1+\\b", "$1")
    ); // A e I O u

Now suppose we specify that a run should be collapsed only if it starts with an uppercase letter. Then we must put the (?i) in the appropriate place:

    System.out.println(
        "AaAaaA eeEeeE IiiIi OoooOo uuUuUuu"
            .replaceAll("\\b([A-Z])(?i)\\1+\\b", "$1")
    ); // A eeEeeE I O uuUuUuu

More generally, you can enable and disable any flag within the pattern as you wish.
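
A small sketch of what toggling looks like in practice: `(?i)` turns the flag on, `(?-i)` turns it off again, and `(?i:...)` limits it to a non-capturing group. The class, constant and method names here are made up for the demo; the flag syntax itself is the one documented for `java.util.regex.Pattern`.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class FlagToggling {
    // (?i) switches case-insensitivity on, (?-i) switches it off again.
    static final Pattern ON_THEN_OFF = Pattern.compile("(?i)abc(?-i)DEF");
    // (?i:...) applies the flag only inside the non-capturing group.
    static final Pattern SCOPED = Pattern.compile("a(?i:b)c");

    static boolean matches(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(ON_THEN_OFF, "ABCDEF")); // true
        System.out.println(matches(ON_THEN_OFF, "abcdef")); // false: DEF is case-sensitive
        System.out.println(matches(SCOPED, "aBc"));         // true
        System.out.println(matches(SCOPED, "ABC"));         // false: a and c are case-sensitive
    }
}
```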

不爱素颜 2024-09-20 13:41:50

If your whole expression is case insensitive, you can just specify the CASE_INSENSITIVE flag:

Pattern.compile(regexp, Pattern.CASE_INSENSITIVE)
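
One caveat that may matter if the text is not plain ASCII: on its own, CASE_INSENSITIVE only folds case for US-ASCII characters, and Pattern.UNICODE_CASE has to be added for other letters. The sketch below illustrates this with é/É written as Unicode escapes; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class UnicodeCaseFlag {
    // Returns true if the whole input matches the regex under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String lower = "\u00e9"; // é
        String upper = "\u00c9"; // É

        // ASCII-only case folding: the accented letters are treated as different.
        System.out.println(matches(lower, Pattern.CASE_INSENSITIVE, upper)); // false

        // Unicode-aware case folding.
        System.out.println(matches(lower,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, upper));    // true
    }
}
```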

极致的悲 2024-09-20 13:41:50

You can also convert the input string to lower case before matching it against the pattern, and use only lower-case characters in the pattern itself.
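
A minimal sketch of this approach, assuming it is acceptable for the result to come back lowercased (the class and method names are made up for illustration). Unlike the `(?i)` variant, the original capitalization is lost, which matters when the match is used in a replacement rather than a simple yes/no test:

```java
import java.util.Locale;

// Hypothetical demo class, not part of the original answer.
class LowercaseFirst {
    // Lowercases the input, then applies a case-sensitive pattern.
    static String collapseLowercased(String input) {
        return input.toLowerCase(Locale.ROOT)
                .replaceAll("\\b(\\w+)(\\s+\\1)+\\b", "$1");
    }

    // For comparison: the inline-flag version keeps the original casing.
    static String collapseKeepingCase(String input) {
        return input.replaceAll("(?i)\\b(\\w+)(\\s+\\1)+\\b", "$1");
    }

    public static void main(String[] args) {
        String input = "Have a meRry MErrY Christmas";
        System.out.println(collapseLowercased(input));  // have a merry christmas
        System.out.println(collapseKeepingCase(input)); // Have a meRry Christmas
    }
}
```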
