Can I replace groups in Java regex?

Posted on 2024-07-24 09:37:30

I have this code, and I want to know whether I can replace only the groups (not the whole pattern) in a Java regex.
Code:

//...
Pattern p = Pattern.compile("(\\d).*(\\d)");
String input = "6 example input 4";
Matcher m = p.matcher(input);
if (m.find()) {

    // Now I want to replace group one ( (\\d) ) with number
    // and group two (also (\\d) ) with 1, but I don't know how.

}

十级心震 2024-07-31 09:37:30

Use $n (where n is a digit) to refer to captured subsequences in replaceFirst(...). I'm assuming you wanted to replace the first group with the literal string "number" and the second group with the value of the first group.

Pattern p = Pattern.compile("(\\d)(.*)(\\d)");
String input = "6 example input 4";
Matcher m = p.matcher(input);
if (m.find()) {
    // replace first number with "number" and second number with the first
    // the added group ("(.*)" which is $2) captures unmodified text to include it in the result
    String output = m.replaceFirst("number$2$1"); // "number example input 6"
}

Consider (\D+) for the second group instead of (.*). The * quantifier is greedy, so .* will at first consume everything up to the end of the input, including the last digit. The matcher then has to backtrack when it finds that the final (\d) has nothing left to match, before it can match the final digit.
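To make the difference between the two middle groups concrete, here is a small sketch. The class name GreedyVsNonDigit, the helper swap, and the second input string are made up for illustration; only Pattern, replaceFirst and the two regexes come from the answer above.

```java
import java.util.regex.Pattern;

class GreedyVsNonDigit {
    // Pattern from the answer: the greedy (.*) runs to the LAST digit in the input.
    static final Pattern GREEDY = Pattern.compile("(\\d)(.*)(\\d)");
    // Suggested variant: (\D+) can only span non-digits, so it stops at the NEXT digit.
    static final Pattern NON_DIGIT = Pattern.compile("(\\d)(\\D+)(\\d)");

    static String swap(Pattern p, String input) {
        // $2 re-inserts the untouched middle text, $1 re-uses the first digit.
        return p.matcher(input).replaceFirst("number$2$1");
    }

    public static void main(String[] args) {
        System.out.println(swap(GREEDY, "6 example input 4"));      // number example input 6
        System.out.println(swap(NON_DIGIT, "6 example input 4"));   // number example input 6
        System.out.println(swap(GREEDY, "6 example 5 input 4"));    // number example 5 input 6
        System.out.println(swap(NON_DIGIT, "6 example 5 input 4")); // number example 6 input 4
    }
}
```

For the question's input both patterns give the same result. They only diverge once the input contains more than two digits, so the choice also decides which digit counts as "the second one".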

Edit

Years later, this still gets votes, and the comments and edits (which broke the answer) show there is still confusion about what the question meant. I've fixed it, and added the much-needed example output.

The edits to the replacement (some thought $2 should not be used) actually broke the answer. Although the continued votes show that the answer hits the key point (use $n references within replaceFirst(...) to reuse captured values), the edits lost the fact that unmodified text needs to be captured as well, and used in the replacement, so that "only groups (not all pattern)" are replaced.

The question, and thus this answer, is not concerned with iterating. This is intentionally an MRE.

醉生梦死 2024-07-31 09:37:30

You could use Matcher#start(group) and Matcher#end(group) to build a generic replacement method:

public static String replaceGroup(String regex, String source, int groupToReplace, String replacement) {
    return replaceGroup(regex, source, groupToReplace, 1, replacement);
}

public static String replaceGroup(String regex, String source, int groupToReplace, int groupOccurrence, String replacement) {
    Matcher m = Pattern.compile(regex).matcher(source);
    for (int i = 0; i < groupOccurrence; i++)
        if (!m.find()) return source; // pattern not met, may also throw an exception here
    return new StringBuilder(source).replace(m.start(groupToReplace), m.end(groupToReplace), replacement).toString();
}

public static void main(String[] args) {
    // replace with "%" what was matched by group 1 
    // input: aaa123ccc
    // output: %123ccc
    System.out.println(replaceGroup("([a-z]+)([0-9]+)([a-z]+)", "aaa123ccc", 1, "%"));

    // replace with "!!!" what was matched the 4th time by the group 2
    // input: a1b2c3d4e5
    // output: a1b2c3d!!!e5
    System.out.println(replaceGroup("([a-z])(\\d)", "a1b2c3d4e5", 2, 4, "!!!"));
}

Check online demo here.

风吹过旳痕迹 2024-07-31 09:37:30

Sorry to beat a dead horse, but it is kind-of weird that no-one pointed this out - "Yes you can, but this is the opposite of how you use capturing groups in real life".

If you use Regex the way it is meant to be used, the solution is as simple as this:

"6 example input 4".replaceAll("(?:\\d)(.*)(?:\\d)", "number$11");

Or as rightfully pointed out by shmosel below,

"6 example input 4".replaceAll("\\d(.*)\\d", "number$11");

...since in your regex there is no good reason to group the digits at all.

You don't usually use capturing groups on the parts of the string you want to discard, you use them on the part of the string you want to keep.
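The replacement string "number$11" can look like a reference to group 11. As a sketch of how it is read: because the pattern has only one capturing group, Java takes $1 as the group reference and treats the following 1 as a literal character. The class name KeepWhatYouCapture and the group name middle are invented here; the named-group form is just one way to make the intent explicit.

```java
class KeepWhatYouCapture {
    // Only the middle text is captured; the digits are matched and discarded.
    static String rewrite(String input) {
        // "$11" is parsed as group 1 followed by a literal '1',
        // because group 11 does not exist in this pattern.
        return input.replaceAll("\\d(.*)\\d", "number$11");
    }

    // Same rewrite with a named group (hypothetical name "middle"),
    // which avoids the visual ambiguity of "$11".
    static String rewriteNamed(String input) {
        return input.replaceAll("\\d(?<middle>.*)\\d", "number${middle}1");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("6 example input 4"));      // number example input 1
        System.out.println(rewriteNamed("6 example input 4")); // number example input 1
    }
}
```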

If you really want groups that you can replace, what you probably want instead is a templating engine (e.g. mustache, ejs, StringTemplate, ...).

As an aside for the curious, even non-capturing groups in regexes are just there for the case that the regex engine needs them to recognize and skip variable text. For example, in

(?:abc)*(capture me)(?:bcd)*

you need them if your input can look either like "abcabccapture mebcdbcd" or "abccapture mebcd" or even just "capture me".

Or to put it the other way around: if the text is always the same, and you don't capture it, there is no reason to use groups at all.

未央 2024-07-31 09:37:30

Replace the password fields in the input:

{"_csrf":["9d90c85f-ac73-4b15-ad08-ebaa3fa4a005"],"originPassword":["uaas"],"newPassword":["uaas"],"confirmPassword":["uaas"]}



  private static final Pattern PATTERN = Pattern.compile(".*?password.*?\":\\[\"(.*?)\"\\](,\"|}$)", Pattern.CASE_INSENSITIVE);

  private static String replacePassword(String input, String replacement) {
    Matcher m = PATTERN.matcher(input);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      // Rebuild the whole match with only group 1 (the password value) swapped out.
      // Group offsets are relative to the input, so shift them by the match start.
      String result = new StringBuilder(m.group(0))
          .replace(m.start(1) - m.start(), m.end(1) - m.start(), replacement)
          .toString();
      // quoteReplacement keeps any '$' or '\' in the text from being read as a group reference.
      m.appendReplacement(sb, Matcher.quoteReplacement(result));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  @Test
  public void test1() {
    String input = "{\"_csrf\":[\"9d90c85f-ac73-4b15-ad08-ebaa3fa4a005\"],\"originPassword\":[\"123\"],\"newPassword\":[\"456\"],\"confirmPassword\":[\"456\"]}";
    String expected = "{\"_csrf\":[\"9d90c85f-ac73-4b15-ad08-ebaa3fa4a005\"],\"originPassword\":[\"**\"],\"newPassword\":[\"**\"],\"confirmPassword\":[\"**\"]}";
    Assert.assertEquals(expected, replacePassword(input, "**"));
  }
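As a possible alternative, the same masking can be sketched with the $n technique from the first answer: capture the text around the value and put it back unchanged. The class name MaskPasswords and the tighter regex are assumptions made for this sketch; unlike the pattern above, it only matches keys that end in "password".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MaskPasswords {
    // Group 1: end of the key plus the opening of the array; group 2: the value; group 3: the closing.
    private static final Pattern FIELD = Pattern.compile(
        "(password\":\\[\")(.*?)(\"\\])", Pattern.CASE_INSENSITIVE);

    static String mask(String input, String replacement) {
        // $1 and $3 restore the unmodified text; only group 2 is replaced.
        // quoteReplacement protects a mask that contains '$' or '\'.
        return FIELD.matcher(input)
            .replaceAll("$1" + Matcher.quoteReplacement(replacement) + "$3");
    }

    public static void main(String[] args) {
        String input = "{\"_csrf\":[\"abc\"],\"originPassword\":[\"123\"],\"newPassword\":[\"456\"]}";
        System.out.println(mask(input, "**"));
    }
}
```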

无妨# 2024-07-31 09:37:30

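You can use the matcher.start() and matcher.end() methods (the overloads that take a group index) to get the positions of a group. With these positions you can easily replace any text.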
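A minimal sketch of that idea, applied to the question's input and assuming the goal is to turn group one into the literal text "number" and group two into "1". The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOffsets {
    static String replaceBothGroups(String input) {
        Matcher m = Pattern.compile("(\\d).*(\\d)").matcher(input);
        if (!m.find()) {
            return input; // no match: leave the input unchanged
        }
        StringBuilder sb = new StringBuilder(input);
        // Replace the later group first so the offsets of the earlier group stay valid.
        sb.replace(m.start(2), m.end(2), "1");
        sb.replace(m.start(1), m.end(1), "number");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceBothGroups("6 example input 4")); // number example input 1
    }
}
```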

笑咖 2024-07-31 09:37:30

Here is a different solution that also allows the replacement of a single group in multiple matches.
It collects the group positions on stacks and then applies the replacements in reverse order (last match first), so the earlier offsets stay valid while the string is being modified.

private static void demo () {

    final String sourceString = "hello world!";

    final String regex = "(hello) (world)(!)";
    final Pattern pattern = Pattern.compile(regex);

    String result = replaceTextOfMatchGroup(sourceString, pattern, 2, world -> world.toUpperCase());
    System.out.println(result);  // output: hello WORLD!
}

public static String replaceTextOfMatchGroup(String sourceString, Pattern pattern, int groupToReplace, Function<String,String> replaceStrategy) {
    Stack<Integer> startPositions = new Stack<>();
    Stack<Integer> endPositions = new Stack<>();
    Matcher matcher = pattern.matcher(sourceString);

    while (matcher.find()) {
        startPositions.push(matcher.start(groupToReplace));
        endPositions.push(matcher.end(groupToReplace));
    }
    StringBuilder sb = new StringBuilder(sourceString);
    while (! startPositions.isEmpty()) {
        int start = startPositions.pop();
        int end = endPositions.pop();
        if (start >= 0 && end >= 0) {
            sb.replace(start, end, replaceStrategy.apply(sourceString.substring(start, end)));
        }
    }
    return sb.toString();       
}
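The same result can probably be reached without stacks by building the output in a single forward pass. The following is only a sketch under that assumption; SinglePassGroupReplace and replaceGroup are hypothetical names, and it assumes the chosen group always lies inside its match (no groups inside lookarounds).

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinglePassGroupReplace {
    static String replaceGroup(String source, Pattern pattern, int group,
                               Function<String, String> strategy) {
        Matcher m = pattern.matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the text already copied to the output
        while (m.find()) {
            int start = m.start(group);
            if (start < 0) {
                continue; // the group did not participate in this match
            }
            out.append(source, last, start);          // untouched text before the group
            out.append(strategy.apply(m.group(group))); // replacement for the group
            last = m.end(group);
        }
        out.append(source, last, source.length());    // untouched tail
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(hello) (world)(!)");
        System.out.println(replaceGroup("hello world!", p, 2, String::toUpperCase)); // hello WORLD!
    }
}
```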

踏雪无痕 2024-07-31 09:37:30

Since Java 9 you can use the Matcher.replaceAll overload that takes a Function<MatchResult, String>.
The usage is as follows:

Pattern p = Pattern.compile("(\\d)(.*)(\\d)");
String input = "6 example input 4";
Matcher matcher = p.matcher(input);
String output = matcher.replaceAll(matchResult -> "%s%s%s".formatted("number", matchResult.group(2), matchResult.group(1) ));

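output should be equal to "number example input 6". Note that String.formatted requires Java 15 or later; on older versions use String.format instead.

matchResult.group(0) is the whole match, so capturing groups are indexed from 1.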
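One detail worth keeping in mind with this overload: the string returned by the function is still interpreted as a replacement string, so a '$' or '\' in it is treated as a group reference or escape. A small sketch, with the invented names FunctionalReplace and addDollar, that wraps the result in Matcher.quoteReplacement to keep it literal:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionalReplace {
    // Prefix every number with a literal dollar sign (requires Java 9+).
    static String addDollar(String input) {
        return Pattern.compile("(\\d+)").matcher(input)
            // Without quoteReplacement, "$5" would be read as a reference to group 5.
            .replaceAll(mr -> Matcher.quoteReplacement("$" + mr.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(addDollar("price 5 then 17")); // price $5 then $17
    }
}
```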
