String.replaceAll() vs. Matcher.replaceAll() (performance differences)

Posted on 2024-08-05 19:20:45

Are there any known performance differences between String.replaceAll() and Matcher.replaceAll() (on a Matcher object created from a java.util.regex.Pattern)?

Also, what are the high-level API differences between the two? (Immutability, handling nulls, handling empty strings, etc.)

Comments (7)

南冥有猫 2024-08-12 19:20:46

According to the documentation for String.replaceAll, it has the following to say about calling the method:

An invocation of this method of the
form str.replaceAll(regex, repl)
yields exactly the same result as the
expression

Pattern.compile(regex).matcher(str).replaceAll(repl)

Therefore, invoking String.replaceAll and explicitly creating a Pattern and a Matcher can be expected to perform the same.

Edit

As has been pointed out in the comments, that only holds for a single call to replaceAll on a String or a Matcher. If you need to call replaceAll many times, one would expect it to be beneficial to hold onto a compiled Pattern, so that the relatively expensive regular expression compilation does not have to be performed every time.
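The documented equivalence can be checked with a small sketch. The regex, the replacement and the class and method names below are made up for illustration only; the point is just that both routes give the same result for a single call.

```java
import java.util.regex.Pattern;

public class ReplaceAllEquivalence {
    // Hypothetical regex and replacement, chosen only for this illustration.
    static final String REGEX = "\\s+";
    static final String REPL = " ";

    // Route 1: the convenience method on String.
    static String viaString(String input) {
        return input.replaceAll(REGEX, REPL);
    }

    // Route 2: the expression the String.replaceAll documentation describes.
    static String viaMatcher(String input) {
        return Pattern.compile(REGEX).matcher(input).replaceAll(REPL);
    }

    public static void main(String[] args) {
        String input = "a   b \t c";
        System.out.println(viaString(input));                           // prints "a b c"
        System.out.println(viaString(input).equals(viaMatcher(input))); // prints "true"
    }
}
```

Both methods compile the pattern on every call, so neither has an advantage here; the difference only shows up once the compiled Pattern is kept and reused.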

樱花细雨 2024-08-12 19:20:46

Source code of String.replaceAll():

public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}

It has to compile the pattern first - if you're going to run it many times with the same pattern on short strings, performance will be much better if you reuse one compiled Pattern.

忆悲凉 2024-08-12 19:20:46

The main difference is that if you hold onto the Pattern used to produce the Matcher, you can avoid recompiling the regex every time you use it. Going through String, you don't get the ability to "cache" like this.

If you have a different regex every time, using the String class's replaceAll is fine. If you are applying the same regex to many strings, create one Pattern and reuse it.

林空鹿饮溪 2024-08-12 19:20:46

Immutability / thread safety: compiled Patterns are immutable, Matchers are not. (see Is Java Regex Thread Safe?)
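In practice this means a Pattern can be shared freely, while each thread (or each call) should get its own Matcher. A rough sketch, with a hypothetical class name and regex, using a parallel stream as a stand-in for "several threads":

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SharedPatternDemo {
    // One immutable Pattern shared by all worker threads.
    private static final Pattern VOWELS = Pattern.compile("[aeiou]");

    static List<String> stripVowels(List<String> inputs) {
        return inputs.parallelStream()
                // Each element gets a fresh Matcher, so no Matcher is shared between threads.
                .map(s -> VOWELS.matcher(s).replaceAll(""))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(stripVowels(Arrays.asList("pattern", "matcher", "string")));
        // prints [pttrn, mtchr, strng]
    }
}
```

Sharing a single Matcher across the stream instead would not be safe, since a Matcher carries mutable match state.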

Handling empty strings: replaceAll should handle empty strings gracefully (a pattern that needs at least one character finds no match in an empty input, so the empty string comes back unchanged)

Making coffee, etc.: last I heard, neither String nor Pattern nor Matcher had any API features for that.

edit: as for handling NULLs, the documentation for String and Pattern doesn't explicitly say so, but I suspect they'd throw a NullPointerException since they expect a String.
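The empty-string and null cases are easy to try out. The sketch below uses a hypothetical EdgeCases class and a small throwsNpe helper written just for this example; it only checks a null regex, which is the case I'm sure about.

```java
import java.util.regex.Pattern;

public class EdgeCases {
    // Helper for this example only: reports whether the action throws a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Empty input, pattern that requires at least one character: nothing to replace.
    static String emptyInputNoMatch() {
        return "".replaceAll("a+", "x");
    }

    // Empty input, pattern that can match the empty string: one match at index 0.
    static String emptyInputEmptyMatch() {
        return "".replaceAll("a*", "x");
    }

    public static void main(String[] args) {
        System.out.println("[" + emptyInputNoMatch() + "]");    // prints "[]"
        System.out.println("[" + emptyInputEmptyMatch() + "]"); // prints "[x]"
        System.out.println(throwsNpe(() -> "abc".replaceAll(null, "x"))); // prints "true"
        System.out.println(throwsNpe(() -> Pattern.compile(null)));       // prints "true"
    }
}
```

So an empty input is not an error, but whether it is "matched" depends on whether the pattern itself can match the empty string.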

起风了 2024-08-12 19:20:46

The implementation of String.replaceAll tells you everything you need to know:

return Pattern.compile(regex).matcher(this).replaceAll(replacement);

(And the docs say the same thing.)

While I haven't checked for caching, I'd certainly expect that compiling a pattern once and keeping a static reference to that would be more efficient than calling Pattern.compile with the same pattern each time. If there's a cache it'll be a small efficiency saving - if there isn't it could be a large one.

影子是时光的心 2024-08-12 19:20:46

The difference is that String.replaceAll() compiles the regex each time it's called. There's no equivalent for .NET's static Regex.Replace() method, which automatically caches the compiled regex. Usually, replaceAll() is something you do only once, but if you're going to be calling it repeatedly with the same regex, especially in a loop, you should create a Pattern object and use the Matcher method.

You can create the Matcher ahead of time, too, and use its reset() method to retarget it for each use:

Matcher m = Pattern.compile(regex).matcher("");
for (String s : targets)
{
  System.out.println(m.reset(s).replaceAll(repl));
}

The performance benefit of reusing the Matcher, of course, is nowhere near as great as that of reusing the Pattern.
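For completeness, here is a self-contained version of the reset() loop. The class name, method name and sample inputs are hypothetical; the structure is the same as the snippet above, just collecting the results instead of printing them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherResetDemo {
    // Applies one regex to many targets with a single Pattern and a single Matcher.
    static List<String> replaceEach(List<String> targets, String regex, String repl) {
        // Start with a dummy input; reset(s) points the same Matcher at each real target.
        Matcher m = Pattern.compile(regex).matcher("");
        List<String> out = new ArrayList<>();
        for (String s : targets) {
            out.add(m.reset(s).replaceAll(repl));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(replaceEach(Arrays.asList("a-b", "c--d"), "-+", "+"));
        // prints [a+b, c+d]
    }
}
```

Because the Matcher is mutable, this only makes sense when the loop runs on a single thread.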

笑脸一如从前 2024-08-12 19:20:46

The other answers sufficiently cover the performance part of the OP, but another difference between Matcher::replaceAll and String::replaceAll is also a reason to compile your own Pattern. When you compile a Pattern yourself, there are options like flags to modify how the regex is applied. For example:

Pattern myPattern = Pattern.compile(myRegex, Pattern.CASE_INSENSITIVE);

The Matcher will apply all the flags you set when you call Matcher::replaceAll.

There are other flags you can set as well. Mostly I just wanted to point out that the Pattern and Matcher API has lots of options, and that's the primary reason to go beyond the simple String::replaceAll.
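A quick sketch of the effect of a flag, using a hypothetical FlagDemo class and a made-up regex. It also shows that for this particular flag the embedded form (?i) gets the same behaviour through String::replaceAll, although the regex is then still recompiled on every call.

```java
import java.util.regex.Pattern;

public class FlagDemo {
    // Pattern compiled once with an explicit flag.
    private static final Pattern HELLO_ANY_CASE =
            Pattern.compile("hello", Pattern.CASE_INSENSITIVE);

    // Matcher::replaceAll honours the flags the Pattern was compiled with.
    static String withFlag(String input) {
        return HELLO_ANY_CASE.matcher(input).replaceAll("bye");
    }

    // Plain String::replaceAll has no flags parameter, so matching is case-sensitive.
    static String withoutFlag(String input) {
        return input.replaceAll("hello", "bye");
    }

    // The embedded flag expression (?i) switches on case-insensitive matching inside the regex.
    static String withInlineFlag(String input) {
        return input.replaceAll("(?i)hello", "bye");
    }

    public static void main(String[] args) {
        String input = "Hello HELLO hello";
        System.out.println(withFlag(input));       // prints "bye bye bye"
        System.out.println(withoutFlag(input));    // prints "Hello HELLO bye"
        System.out.println(withInlineFlag(input)); // prints "bye bye bye"
    }
}
```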
