Replacing multiple substrings at once

Posted 2024-12-08 10:36:52

Say I have a file, that contains some text. There are substrings like "substr1", "substr2", "substr3" etc. in it. I need to replace all of those substrings with some other text, like "repl1", "repl2", "repl3". In Python, I would create a dictionary like this:

{
 "substr1": "repl1",
 "substr2": "repl2",
 "substr3": "repl3"
}

and create the pattern joining the keys with '|', then replace with re.sub function.
Is there a similar simple way to do this in Java?

Comments (5)

Posted 2024-12-15 10:36:52

This is how your Python suggestion translates to Java:

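// needs: java.util.HashMap, java.util.Map, java.util.regex.Matcher, java.util.regex.Pattern
Map<String, String> replacements = new HashMap<>();
replacements.put("substr1", "repl1");
replacements.put("substr2", "repl2");
replacements.put("substr3", "repl3");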

String input = "lorem substr1 ipsum substr2 dolor substr3 amet";

// create the pattern joining the keys with '|'
String regexp = "substr1|substr2|substr3";

StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(input);

while (m.find())
    m.appendReplacement(sb, replacements.get(m.group()));
m.appendTail(sb);


System.out.println(sb.toString());   // lorem repl1 ipsum repl2 dolor repl3 amet

This approach does a simultaneous (i.e. "at once") replacement. That is, if you happened to have

"a" -> "b"
"b" -> "c"

then this approach gives "a b" -> "b c", whereas the answers that chain several calls to replace or replaceAll give "c c".
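To make the difference concrete, here is a small sketch that runs both strategies on the "a"/"b" example. The class name SimultaneousVsChained and its two method names are made up for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimultaneousVsChained {

    // Single pass over the input: text produced by a replacement is never rescanned.
    static String simultaneous(String input) {
        Map<String, String> replacements = new HashMap<>();
        replacements.put("a", "b");
        replacements.put("b", "c");

        Matcher m = Pattern.compile("a|b").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacements.get(m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Chained calls: the second replace also sees the "b" produced by the first.
    static String chained(String input) {
        return input.replace("a", "b").replace("b", "c");
    }

    public static void main(String[] args) {
        System.out.println(simultaneous("a b")); // b c
        System.out.println(chained("a b"));      // c c
    }
}
```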


(If you generalize this approach to create the regexp programmatically, make sure you apply Pattern.quote to each individual search word and Matcher.quoteReplacement to each replacement word.)
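A minimal sketch of that generalization might look like the following. The class name MultiReplace and the method name replaceAllLiteral are hypothetical; only Pattern.quote, Matcher.quoteReplacement and the appendReplacement/appendTail loop come from the answer above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class MultiReplace {

    // Replaces every key of 'replacements' found in 'input' with its value, in one pass.
    static String replaceAllLiteral(String input, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return input; // an empty alternation would match the empty string everywhere
        }
        // Quote each search word so regex metacharacters such as '.' are taken literally.
        String regexp = replacements.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));

        Matcher m = Pattern.compile(regexp).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Quote the replacement so '$' and '\' are not read as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacements.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("a.b", "$1");
        replacements.put("price", "cost");
        System.out.println(replaceAllLiteral("a.b axb price", replacements)); // $1 axb cost
    }
}
```

Without the quoting, "a.b" would also match "axb", and "$1" would be read as a reference to a capture group that does not exist. Note that when one key is a prefix of another, the alternative listed first in the pattern wins, so the iteration order of the map matters in that case.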

撩动你心 2024-12-15 10:36:52
yourString.replace("substr1", "repl1")
          .replace("substr2", "repl2")
          .replace("substr3", "repl3");
流绪微梦 2024-12-15 10:36:52

First, a demonstration of the problem:

String s = "I have three cats and two dogs.";
s = s.replace("cats", "dogs")
    .replace("dogs", "budgies");
System.out.println(s);

This is intended to replace cats => dogs and dogs => budgies, but the sequential replacement operates on the result of the previous replacement, so the unfortunate output is:

I have three budgies and two budgies.

Here's my implementation of a simultaneous replacement method. It's easy to write using String.regionMatches:

public static String simultaneousReplace(String subject, String... pairs) {
    if (pairs.length % 2 != 0) throw new IllegalArgumentException(
        "Strings to find and replace are not paired.");
    StringBuilder sb = new StringBuilder();
    int numPairs = pairs.length / 2;
    outer:
    for (int i = 0; i < subject.length(); i++) {
        for (int j = 0; j < numPairs; j++) {
            String find = pairs[j * 2];
            // Skip empty search strings: they match everywhere and would never advance i.
            if (find.isEmpty()) continue;
            if (subject.regionMatches(i, find, 0, find.length())) {
                sb.append(pairs[j * 2 + 1]);
                i += find.length() - 1;  // the loop's i++ steps past the last matched char
                continue outer;
            }
        }
        sb.append(subject.charAt(i));  // no search string starts here; copy the char
    }
    return sb.toString();
}

Testing:

String s = "I have three cats and two dogs.";
s = simultaneousReplace(s,
    "cats", "dogs",
    "dogs", "budgies");
System.out.println(s);

Output:

I have three dogs and two budgies.

Additionally, when doing simultaneous replacement it is sometimes useful to prefer the longest match at each position. (PHP's strtr function does this, for example.) Here is my implementation of that:

public static String simultaneousReplaceLongest(String subject, String... pairs) {
    if (pairs.length % 2 != 0) throw new IllegalArgumentException(
        "Strings to find and replace are not paired.");
    StringBuilder sb = new StringBuilder();
    int numPairs = pairs.length / 2;
    for (int i = 0; i < subject.length(); i++) {
        int longestMatchIndex = -1;
        int longestMatchLength = -1;
        // Try every search string at position i and remember the longest one that matches.
        for (int j = 0; j < numPairs; j++) {
            String find = pairs[j * 2];
            // Skip empty search strings: they match everywhere and would never advance i.
            if (find.isEmpty()) continue;
            if (subject.regionMatches(i, find, 0, find.length())) {
                if (find.length() > longestMatchLength) {
                    longestMatchIndex = j;
                    longestMatchLength = find.length();
                }
            }
        }
        if (longestMatchIndex >= 0) {
            sb.append(pairs[longestMatchIndex * 2 + 1]);
            i += longestMatchLength - 1;  // the loop's i++ steps past the last matched char
        } else {
            sb.append(subject.charAt(i));
        }
    }
    return sb.toString();
}

Why would you need this? Example follows:

String truth = "Java is to JavaScript";
truth += " as " + simultaneousReplaceLongest(truth,
    "Java", "Ham",
    "JavaScript", "Hamster");
System.out.println(truth);

Output:

Java is to JavaScript as Ham is to Hamster

If we had used simultaneousReplace instead of simultaneousReplaceLongest, the output would have had "HamScript" instead of "Hamster" :)

Note that the above methods are case-sensitive. If you need case-insensitive versions, they are easy to derive, because String.regionMatches has an overload that takes an ignoreCase parameter.
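As a rough sketch of that modification, the first method could be adapted as follows. The names CaseInsensitiveReplace and simultaneousReplaceIgnoreCase are made up here; the only real change is the call to the five-argument regionMatches overload, plus a guard that skips empty search strings.

```java
public class CaseInsensitiveReplace {

    // Variant of simultaneousReplace that ignores case when looking for each search string.
    static String simultaneousReplaceIgnoreCase(String subject, String... pairs) {
        if (pairs.length % 2 != 0) throw new IllegalArgumentException(
            "Strings to find and replace are not paired.");
        StringBuilder sb = new StringBuilder();
        int numPairs = pairs.length / 2;
        outer:
        for (int i = 0; i < subject.length(); i++) {
            for (int j = 0; j < numPairs; j++) {
                String find = pairs[j * 2];
                if (find.isEmpty()) continue; // empty search strings would never advance i
                // The five-argument overload takes ignoreCase as its first parameter.
                if (subject.regionMatches(true, i, find, 0, find.length())) {
                    sb.append(pairs[j * 2 + 1]);
                    i += find.length() - 1;
                    continue outer;
                }
            }
            sb.append(subject.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(simultaneousReplaceIgnoreCase(
            "I have three Cats and two DOGS.",
            "cats", "dogs",
            "dogs", "budgies")); // I have three dogs and two budgies.
    }
}
```

The replacement text is inserted exactly as given, so the original capitalization of the matched text is not preserved.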

少女的英雄梦 2024-12-15 10:36:52
    return yourString.replaceAll("substr1", "repl1")
                     .replaceAll("substr2", "repl2")
                     .replaceAll("substr3", "repl3");