Replacing multiple substrings in Java when replacement text overlaps search text

Posted on 2024-12-06 08:03:20

Say you have the following string:

cat dog fish dog fish cat

You want to replace all cats with dogs, all dogs with fish, and all fish with cats. Intuitively, the expected result is:

dog fish cat fish cat dog

If you try the obvious solution, looping through with replaceAll(), you get:

  1. (original) cat dog fish dog fish cat
  2. (cat -> dog) dog dog fish dog fish dog
  3. (dog -> fish) fish fish fish fish fish fish
  4. (fish -> cat) cat cat cat cat cat cat
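
The failure traced above can be reproduced with a few lines. This is a minimal sketch: the class and method names are made up for illustration, and it uses the literal `String.replace` instead of `replaceAll()` so no regex quoting is needed. A `LinkedHashMap` keeps the replacements in the order cat, dog, fish.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SequentialReplaceDemo {
    // Applies each replacement to the whole string in turn, so later
    // passes also rewrite text that earlier passes produced.
    static String replaceSequentially(String input, Map<String, String> replacements) {
        String result = input;
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>(); // keeps insertion order
        replacements.put("cat", "dog");
        replacements.put("dog", "fish");
        replacements.put("fish", "cat");
        System.out.println(replaceSequentially("cat dog fish dog fish cat", replacements));
        // prints "cat cat cat cat cat cat"
    }
}
```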

Clearly, this is not the intended result. So what's the simplest way to do this? I can cobble something together with Pattern and Matcher (and a lot of Pattern.quote() and Matcher.quoteReplacement()), but I refuse to believe I'm the first person to have this problem and there's no library function to solve it.

(FWIW, the actual case is a bit more complicated and doesn't involve straight swaps.)

等风来 2024-12-13 08:03:20

It seems StringUtils.replaceEach in Apache Commons Lang does what you want:

StringUtils.replaceEach("abcdeab", new String[]{"ab", "cd"}, new String[]{"cd", "ab"});
// returns "cdabecd"

Note that the documentation linked above seems to be in error. See the comments below for details.
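
If Apache Commons is not on the classpath, the same single-pass idea can be approximated with plain `indexOf`. The following is only a sketch, not the library's implementation: `replaceEachSketch` is a hypothetical name, it assumes both arrays have the same length, and when two search strings match at the same position the earlier array entry wins.

```java
public class ReplaceEachSketch {
    // Hypothetical helper: scans left to right, always replacing the
    // earliest match, and never rescans text it has already emitted.
    static String replaceEachSketch(String text, String[] search, String[] replace) {
        if (search.length != replace.length) {
            throw new IllegalArgumentException("search and replace must have the same length");
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < text.length()) {
            int bestIndex = -1;
            int bestKey = -1;
            for (int i = 0; i < search.length; i++) {
                if (search[i].isEmpty()) {
                    continue; // an empty search string would match everywhere
                }
                int idx = text.indexOf(search[i], pos);
                if (idx >= 0 && (bestIndex < 0 || idx < bestIndex)) {
                    bestIndex = idx;
                    bestKey = i;
                }
            }
            if (bestIndex < 0) {
                break; // nothing left to replace
            }
            // Copy the untouched text before the match, then the replacement.
            out.append(text, pos, bestIndex).append(replace[bestKey]);
            pos = bestIndex + search[bestKey].length();
        }
        out.append(text, pos, text.length()); // copy the tail
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEachSketch("abcdeab",
            new String[]{"ab", "cd"}, new String[]{"cd", "ab"}));
        // prints "cdabecd"
    }
}
```

Because the output is written to a separate buffer, a replacement such as "cd" is never seen again by the search for "cd".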

String rep = str.replace("cat","§1§").replace("dog","§2§")
                .replace("fish","§3§").replace("§1§","dog")
                .replace("§2§","fish").replace("§3§","cat");

Ugly and inefficient as hell, but it works, provided the placeholder strings never occur in the input.
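
The placeholder trick can be generalized a little. The sketch below is an illustration under stated assumptions, not a robust implementation: `swapViaPlaceholders` is a hypothetical name, it uses characters from the Unicode private-use area (starting at U+E000) as placeholders, and it assumes that neither the search nor the replacement strings contain those characters and that no search string is empty. It refuses input that already contains a placeholder.

```java
public class PlaceholderSwap {
    // First character of the Unicode private-use area; assumed unused in real text.
    private static final char FIRST_PLACEHOLDER = '\uE000';

    private static String placeholder(int i) {
        return String.valueOf((char) (FIRST_PLACEHOLDER + i));
    }

    // Hypothetical helper: two-phase replacement through temporary placeholders.
    static String swapViaPlaceholders(String input, String[] search, String[] replace) {
        if (search.length != replace.length) {
            throw new IllegalArgumentException("search and replace must have the same length");
        }
        for (int i = 0; i < search.length; i++) {
            if (input.contains(placeholder(i))) {
                throw new IllegalArgumentException("input already contains a placeholder character");
            }
        }
        String result = input;
        // Phase 1: every search string becomes its own one-character placeholder.
        for (int i = 0; i < search.length; i++) {
            result = result.replace(search[i], placeholder(i));
        }
        // Phase 2: every placeholder becomes the final replacement text.
        for (int i = 0; i < search.length; i++) {
            result = result.replace(placeholder(i), replace[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(swapViaPlaceholders("cat dog fish dog fish cat",
            new String[]{"cat", "dog", "fish"}, new String[]{"dog", "fish", "cat"}));
        // prints "dog fish cat fish cat dog"
    }
}
```

It still makes one full pass over the string per replacement, twice, so the efficiency complaint stands.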


OK, here's a more elaborate and generic version. I prefer using a regular expression rather than a scanner. That way I can replace arbitrary Strings, not just words (which can be better or worse). Anyway, here goes:

import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String replace(
    final String input, final Map<String, String> replacements) {

    if (input == null || input.isEmpty() || replacements == null
        || replacements.isEmpty()) {
        return input;
    }
    // Build one alternation of all quoted search strings, e.g. \Qcat\E|\Qdog\E
    StringBuilder regexBuilder = new StringBuilder();
    Iterator<String> it = replacements.keySet().iterator();
    regexBuilder.append(Pattern.quote(it.next()));
    while (it.hasNext()) {
        regexBuilder.append('|').append(Pattern.quote(it.next()));
    }
    Matcher matcher = Pattern.compile(regexBuilder.toString()).matcher(input);
    StringBuffer out = new StringBuffer(input.length() + (input.length() / 10));
    while (matcher.find()) {
        // quoteReplacement keeps '$' and '\' in the replacement text literal
        matcher.appendReplacement(out,
            Matcher.quoteReplacement(replacements.get(matcher.group())));
    }
    matcher.appendTail(out);
    return out.toString();
}

Test Code:

System.out.println(replace("cat dog fish dog fish cat",
    ImmutableMap.of("cat", "dog", "dog", "fish", "fish", "cat")));

Output:

dog fish cat fish cat dog

Obviously this solution only makes sense for many replacements, otherwise it's a huge overkill.
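
On Java 9 or later, the same regex idea can be written more compactly with `Matcher.replaceAll(Function<MatchResult, String>)`. This is a sketch, not a drop-in replacement for the method above: `replaceAllOnce` is a hypothetical name, and sorting the keys longest-first is an added assumption, so that a key that is a prefix of another key (say "a" and "ab") does not shadow the longer one.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class MapReplace {
    // Hypothetical helper: one pass over the input, each match looked up in the map.
    static String replaceAllOnce(String input, Map<String, String> replacements) {
        if (input == null || input.isEmpty()
                || replacements == null || replacements.isEmpty()) {
            return input;
        }
        // Longest keys first, so the alternation prefers "ab" over "a".
        String regex = replacements.keySet().stream()
            .sorted(Comparator.comparingInt(String::length).reversed())
            .map(Pattern::quote)
            .collect(Collectors.joining("|"));
        // quoteReplacement keeps '$' and '\' in replacement values literal.
        return Pattern.compile(regex).matcher(input)
            .replaceAll(m -> Matcher.quoteReplacement(replacements.get(m.group())));
    }

    public static void main(String[] args) {
        System.out.println(replaceAllOnce("cat dog fish dog fish cat",
            Map.of("cat", "dog", "dog", "fish", "fish", "cat")));
        // prints "dog fish cat fish cat dog"
    }
}
```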

回首观望 2024-12-13 08:03:20

I would create a StringBuilder and then parse the text once, one word at a time, transferring over unchanged words or changed words as I go. I wouldn't parse it for each swap as you're suggesting.

So rather than doing something like:

// pseudocode
text is new text swapping cat with dog
text is new text swapping dog with fish
text is new text swapping fish with cat

I'd do

for each word in text
   if word is cat, swap with dog
   if word is dog, swap with fish
   if word is fish, swap with cat
   transfer new word (or unchanged word) into StringBuilder.

I'd probably make a swap(...) method for this and use a HashMap for the swap.

For example

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class SwapWords {
   private static Map<String, String> myMap = new HashMap<String, String>();

   public static void main(String[] args) {
      // this would really be loaded using a file such as a text file or xml
      // or even a database:
      myMap.put("cat", "dog");
      myMap.put("dog", "fish");
      myMap.put("fish", "cat");

      String testString = "cat dog fish dog fish cat";

      StringBuilder sb = new StringBuilder();
      Scanner testScanner = new Scanner(testString);
      while (testScanner.hasNext()) {
         String text = testScanner.next();
         text = myMap.get(text) == null ? text : myMap.get(text);
         sb.append(text + " ");
      }

      System.out.println(sb.toString().trim());
   }
}
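
The swap(...) method mentioned above might look like the following. This is a sketch with assumed names (`SwapWordsSketch`, `swap`); like the Scanner version it only matches whole whitespace-separated words and collapses runs of whitespace to a single space.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SwapWordsSketch {
    // Looks up each word once; words without a mapping pass through unchanged.
    static String swap(String text, Map<String, String> swaps) {
        return Arrays.stream(text.trim().split("\\s+"))
            .map(word -> swaps.getOrDefault(word, word))
            .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        Map<String, String> swaps = new HashMap<>();
        swaps.put("cat", "dog");
        swaps.put("dog", "fish");
        swaps.put("fish", "cat");
        System.out.println(swap("cat dog fish dog fish cat", swaps));
        // prints "dog fish cat fish cat dog"
    }
}
```

Note that a token such as "cat," (with punctuation attached) is not a key in the map and is left as-is.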

三寸金莲 2024-12-13 08:03:20

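import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class myreplase {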
    public Map<String, String> replase;

    public myreplase() {
        replase = new HashMap<String, String>();

        replase.put("a", "Apple");
        replase.put("b", "Banana");
        replase.put("c", "Cantalope");
        replase.put("d", "Date");
        String word = "a b c d a b c d";

        String ss = "";
        Iterator<String> i = replase.keySet().iterator();
        while (i.hasNext()) {
            ss += i.next();
            if (i.hasNext()) {
                ss += "|";
            }
        }

        Pattern pattern = Pattern.compile(ss);
        StringBuilder buffer = new StringBuilder();
        // Checks one character at a time, so only single-character keys can match
        for (int j = 0, k = 1; j < word.length(); j++,k++) {
            String s = word.substring(j, k);
            Matcher matcher = pattern.matcher(s);
            if (matcher.find()) {
                buffer.append(replase.get(s));
            } else {
                buffer.append(s);
            }
        }
        System.out.println(buffer.toString());
    }

    public static void main(String[] args) {
        new myreplase();
    }
}

Output :-
Apple Banana Cantalope Date Apple Banana Cantalope Date

佞臣 2024-12-13 08:03:20

Here's a method to do it without regex.

I noticed that every time an occurrence of a search string a is replaced with b, that b is always part of the final string. So you can ignore it from then on.

Not only that: after replacing a with b, that position acts as a gap. No later replacement can match across the place where b sits.

Taken together, this looks a lot like split: split the string on a (which creates the gaps between the pieces), apply the remaining replacements to each piece, then join the pieces back together with b.

For example:

// Original
"cat dog fish dog fish cat"

// Replace cat with dog
{"", " dog fish dog fish ", ""}.join("dog")

// Replace dog with fish
{
    "",
    {" ", " fish ", " fish "}.join("fish"),
    ""
}.join("dog")

// Replace fish with cat
{
    "",
    {
        " ",
        {" ", " "}.join("cat"),
        {" ", " "}.join("cat")
    }.join("fish"),
    ""
}.join("dog")

So far the most intuitive way (to me) to do this is recursively:

public static String replaceWithJointMap(String s, Map<String, String> map) {
    // Base case
    if (map.size() == 0) {
        return s;
    }

    // Get some value in the map to replace
    Map.Entry<String, String> pair = map.entrySet().iterator().next();
    String replaceFrom = pair.getKey();
    String replaceTo = pair.getValue();

    // Split the current string with the replaceFrom string
    // Use split with -1 so that trailing empty strings are included
    String[] splitString = s.split(Pattern.quote(replaceFrom), -1);

    // Apply replacements for each of the strings in the splitString
    HashMap<String, String> replacementsLeft = new HashMap<>(map);
    replacementsLeft.remove(replaceFrom);

    for (int i=0; i<splitString.length; i++) {
        splitString[i] = replaceWithJointMap(splitString[i], replacementsLeft);
    }

    // Join back with the current replacements
    return String.join(replaceTo, splitString);
}

I don't think this is very efficient though.
