Replacing multiple substrings in Java when replacement text overlaps search text
Say you have the following string:
cat dog fish dog fish cat
You want to replace all cats with dogs, all dogs with fish, and all fish with cats. Intuitively, the expected result is:
dog fish cat fish cat dog
If you try the obvious solution, looping through with replaceAll(), you get:
- (original)
cat dog fish dog fish cat
- (cat -> dog)
dog dog fish dog fish dog
- (dog -> fish)
fish fish fish fish fish fish
- (fish -> cat)
cat cat cat cat cat cat
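The cascade listed above can be reproduced with a few lines. This is only a sketch of the naive loop: the class name NaiveReplaceDemo and the method naive are made up for illustration, and it uses String.replace (literal matching) instead of replaceAll so that no regex escaping is involved.

```java
final class NaiveReplaceDemo {
    // Hypothetical helper: applies each (search, replacement) pair to the
    // whole string in turn, so later passes also rewrite text that an
    // earlier pass produced.
    static String naive(String text, String[][] pairs) {
        String result = text;
        for (String[] pair : pairs) {
            result = result.replace(pair[0], pair[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] pairs = { {"cat", "dog"}, {"dog", "fish"}, {"fish", "cat"} };
        // Prints: cat cat cat cat cat cat
        System.out.println(naive("cat dog fish dog fish cat", pairs));
    }
}
```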
Clearly, this is not the intended result. So what's the simplest way to do this? I can cobble something together with Pattern and Matcher (and a lot of Pattern.quote() and Matcher.quoteReplacement()), but I refuse to believe I'm the first person to have this problem and there's no library function to solve it.
(FWIW, the actual case is a bit more complicated and doesn't involve straight swaps.)
It seems StringUtils.replaceEach in Apache Commons does what you want:
Note that the documentation at the above links seems to be in error. See comments below for details.
Ugly and inefficient as hell, but works.
OK, here's a more elaborate and generic version. I prefer using a regular expression rather than a scanner. That way I can replace arbitrary Strings, not just words (which can be better or worse). Anyway, here goes:
Test Code:
Output:
Obviously this solution only makes sense for many replacements, otherwise it's a huge overkill.
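The answer's original code is not preserved here, so what follows is only a sketch of one way a regex-based, single-pass replacement could look, not the author's version. The class name RegexMultiReplace and the method replaceAllOnce are hypothetical. It assumes non-empty search strings and tries longer search strings first so that a key which is a prefix of another does not shadow it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class RegexMultiReplace {
    // Hypothetical helper: scans the text once and replaces each match with
    // its mapped value, so replaced text is never scanned again.
    static String replaceAllOnce(String text, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return text;
        }
        // Build one alternation of quoted literals, longest key first.
        String alternation = replacements.keySet().stream()
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile(alternation).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // m.group() is exactly one of the keys because every
            // alternative is a quoted literal.
            String replacement = replacements.get(m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("cat", "dog");
        map.put("dog", "fish");
        map.put("fish", "cat");
        // Prints: dog fish cat fish cat dog
        System.out.println(replaceAllOnce("cat dog fish dog fish cat", map));
    }
}
```

Because the keys are quoted with Pattern.quote() and the values with Matcher.quoteReplacement(), characters such as "." or "$" in either are treated literally.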
I would create a StringBuilder and then parse the text once, one word at a time, transferring over unchanged words or changed words as I go. I wouldn't parse it for each swap as you're suggesting.
So rather than doing something like:
I'd do
I'd probably make a swap(...) method for this and use a HashMap for the swap.
For example
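The code that originally followed is missing, so here is a minimal sketch of the idea as described: walk the text once, collect one word at a time, and append either the word or its mapped replacement to a StringBuilder. The class name WordSwapper, the signature of swap, and the definition of a "word" as a run of letters or digits are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class WordSwapper {
    // Hypothetical swap(...): a single pass over the text, replacing whole
    // words that appear as keys in the map and copying everything else.
    static String swap(String text, Map<String, String> swaps) {
        StringBuilder out = new StringBuilder(text.length());
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c);           // still inside a word
            } else {
                flush(word, swaps, out);  // word ended: emit it (swapped or not)
                out.append(c);            // copy the separator unchanged
            }
        }
        flush(word, swaps, out);          // emit a trailing word, if any
        return out.toString();
    }

    private static void flush(StringBuilder word, Map<String, String> swaps, StringBuilder out) {
        if (word.length() > 0) {
            String w = word.toString();
            out.append(swaps.getOrDefault(w, w));
            word.setLength(0);
        }
    }
}
```

With a HashMap mapping cat to dog, dog to fish, and fish to cat, swap("cat dog fish dog fish cat", map) returns "dog fish cat fish cat dog". Since only whole words are looked up, a word like "catalog" is left alone, which may or may not be what you want.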
Output:
Apple Banana Cantalope Date Apple Banana Cantalope Date
Here's a method to do it without regex.
I noticed that every time a part of the string a gets replaced with b, b will always be part of the final string. So, you can ignore b from then on.
Not only that, after replacing a with b, there will be a "space" left there. No replacement can take place across where b is supposed to be.
These actions add up to look a lot like split: split up the values (making the "space" in between strings), do further replacements for each string in the array, then join them back.
For example:
So far the most intuitive way (to me) to do this is recursively:
I don't think this is very efficient though.
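The original code for this answer is missing, so the following is a sketch of the recursive split-and-join idea as described, not the author's implementation. The class name SplitReplace and its method signatures are hypothetical, and search strings are assumed to be non-empty. Note that String.split takes a regex, so each search string is passed through Pattern.quote to keep the matching literal.

```java
import java.util.regex.Pattern;

final class SplitReplace {
    // Hypothetical entry point: from[i] is replaced with to[i].
    static String replace(String text, String[] from, String[] to) {
        return replace(text, from, to, 0);
    }

    // Split on the i-th search string, handle the remaining search strings
    // inside each piece, then join the pieces with the i-th replacement.
    // Text that was just inserted as a replacement is never searched again.
    private static String replace(String text, String[] from, String[] to, int i) {
        if (i == from.length) {
            return text;
        }
        // Limit -1 keeps trailing empty pieces, so a match at the very end
        // of the text is not lost.
        String[] pieces = text.split(Pattern.quote(from[i]), -1);
        for (int p = 0; p < pieces.length; p++) {
            pieces[p] = replace(pieces[p], from, to, i + 1);
        }
        return String.join(to[i], pieces);
    }

    public static void main(String[] args) {
        String[] from = { "cat", "dog", "fish" };
        String[] to = { "dog", "fish", "cat" };
        // Prints: dog fish cat fish cat dog
        System.out.println(replace("cat dog fish dog fish cat", from, to));
    }
}
```

Earlier entries in the search array take priority over later ones, and every recursion level allocates a new array of pieces, which is consistent with the efficiency concern above.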