StringBuilder.Append vs StringBuilder.AppendFormat

Posted on 2024-07-16 10:26:28

I was wondering about StringBuilder and I've got a question that I was hoping the community would be able to explain.

Let's just forget about code readability, which of these is faster and why?

StringBuilder.Append:

StringBuilder sb = new StringBuilder();
sb.Append(string1);
sb.Append("----");
sb.Append(string2);

StringBuilder.AppendFormat:

StringBuilder sb = new StringBuilder();
sb.AppendFormat("{0}----{1}",string1,string2);

Comments (10)

拒绝两难 2024-07-23 10:26:28

It's impossible to say, not knowing the size of string1 and string2.

With the call to AppendFormat, it will preallocate the buffer just once given the length of the format string and the strings that will be inserted and then concatenate everything and insert it into the buffer. For very large strings, this will be advantageous over separate calls to Append which might cause the buffer to expand multiple times.

However, the three calls to Append might or might not trigger growth of the buffer and that check is performed each call. If the strings are small enough and no buffer expansion is triggered, then it will be faster than the call to AppendFormat because it won't have to parse the format string to figure out where to do the replacements.

More data is needed for a definitive answer

It should be noted that there is little discussion of using the static Concat method on the String class (Jon's answer using AppendWithCapacity reminded me of this). His test results show that to be the best case (assuming you don't have to take advantage of specific format specifiers). String.Concat does the same thing in that it will predetermine the length of the strings to concatenate and preallocate the buffer (with slightly more overhead due to looping constructs through the parameters). Its performance is going to be comparable to Jon's AppendWithCapacity method.

Or, just the plain addition operator, since it compiles to a call to String.Concat anyway, with the caveat that all of the additions are in the same expression:

// One call to String.Concat.
string result = a + b + c;

NOT

// Two calls to String.Concat.
string result = a + b;
result = result + c;
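
To make the comparison concrete, here is a minimal sketch that builds the same text with String.Concat, with a single + expression, and with a pre-sized StringBuilder. The class and method names (ConcatSketch, WithConcat, WithPlus, WithBuilder) are made up for illustration, and the sketch only checks that the results match; it says nothing about timing.

```csharp
using System;
using System.Text;

public static class ConcatSketch
{
    // Hypothetical helper: one String.Concat call for all three pieces.
    public static string WithConcat(string a, string b)
    {
        return string.Concat(a, "----", b);
    }

    // All additions are in one expression, so this is a single concatenation.
    public static string WithPlus(string a, string b)
    {
        return a + "----" + b;
    }

    // Pre-sized StringBuilder, in the spirit of AppendWithCapacity.
    public static string WithBuilder(string a, string b)
    {
        var sb = new StringBuilder(a.Length + b.Length + 4);
        sb.Append(a).Append("----").Append(b);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WithConcat("left", "right"));
        Console.WriteLine(WithPlus("left", "right") == WithBuilder("left", "right"));
    }
}
```

If no format specifiers are needed, any of the three is a reasonable starting point; which one is fastest for a given workload still has to be measured.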

For all those putting up test code

You need to run your test cases in separate runs (or at the least, perform a GC between the measuring of separate test runs). The reason for this is that if you do say, 1,000,000 runs, creating a new StringBuilder in each iteration of the loop for one test, and then you run the next test that loops the same number of times, creating an additional 1,000,000 StringBuilder instances, the GC will more than likely step in during the second test and hinder its timing.
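
A minimal sketch of the "perform a GC between measurements" variant follows. TimingSketch and Measure are hypothetical names, and the iteration count and string sizes are arbitrary; running each case in its own process is still the more robust option.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class TimingSketch
{
    // Hypothetical helper: times `iterations` calls of `body` after one
    // warm-up call and a forced, blocking garbage collection.
    public static TimeSpan Measure(Action body, int iterations)
    {
        body(); // warm-up so JIT compilation is not part of the timing

        // Clear garbage left behind by earlier measurements.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            body();
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        string s1 = new string('x', 1000);
        string s2 = new string('y', 1000);

        TimeSpan append = Measure(() =>
        {
            var sb = new StringBuilder();
            sb.Append(s1).Append("----").Append(s2);
        }, 100000);

        TimeSpan format = Measure(() =>
        {
            var sb = new StringBuilder();
            sb.AppendFormat("{0}----{1}", s1, s2);
        }, 100000);

        Console.WriteLine("Append:       {0} ms", append.TotalMilliseconds);
        Console.WriteLine("AppendFormat: {0} ms", format.TotalMilliseconds);
    }
}
```

The collection happens before the stopwatch starts, so garbage from the previous case is less likely to be charged to the next one. Collections triggered by the measured loop itself are still included, which is arguably part of the real cost.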

战皆罪 2024-07-23 10:26:28

casperOne is correct. Once you reach a certain threshold, the Append() method becomes slower than AppendFormat(). Here are the different lengths and elapsed ticks of 100,000 iterations of each method:

Length: 1

Append()       - 50900
AppendFormat() - 126826

Length: 1000

Append()       - 1241938
AppendFormat() - 1337396

Length: 10,000

Append()       - 12482051
AppendFormat() - 12740862

Length: 20,000

Append()       - 61029875
AppendFormat() - 60483914

When strings with a length near 20,000 are introduced, the AppendFormat() function will slightly outperform Append().

Why does this happen? See casperOne's answer.

Edit:

I reran each test individually under Release configuration and updated the results.

顾冷 2024-07-23 10:26:28

casperOne is entirely accurate that it depends on the data. However, suppose you're writing this as a class library for 3rd parties to consume - which would you use?

One option would be to get the best of both worlds - work out how much data you're actually going to have to append, and then use StringBuilder.EnsureCapacity to make sure we only need a single buffer resize.
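
As a rough sketch of that option, assuming the builder may already hold some text: EnsureCapacitySketch and AppendPair are hypothetical names, while StringBuilder.EnsureCapacity is the real method mentioned above.

```csharp
using System;
using System.Text;

public static class EnsureCapacitySketch
{
    // Hypothetical helper: appends a, "----", b to an existing builder,
    // asking for all the required room up front.
    public static StringBuilder AppendPair(StringBuilder sb, string a, string b)
    {
        const string separator = "----";
        sb.EnsureCapacity(sb.Length + a.Length + separator.Length + b.Length);
        return sb.Append(a).Append(separator).Append(b);
    }

    public static void Main()
    {
        var sb = new StringBuilder("prefix:");
        AppendPair(sb, "left", "right");
        Console.WriteLine(sb.ToString()); // prints "prefix:left----right"
    }
}
```

Unlike passing a capacity to the constructor, this also works when the StringBuilder was created elsewhere and handed to the library.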

If I weren't too bothered though, I'd use Append x3 - it seems "more likely" to be faster, as parsing the string format tokens on every call is clearly make-work.

Note that I've asked the BCL team for a sort of "cached formatter" which we could create using a format string and then re-use repeatedly. It's crazy that the framework has to parse the format string each time it's used.
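
For readers on current runtimes: as far as I know, .NET 8 added System.Text.CompositeFormat, which looks very much like this kind of cached formatter. The sketch below assumes .NET 8 or later; CachedFormatSketch, PairFormat and Join are names made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class CachedFormatSketch
{
    // The format string is parsed once here and reused on every call.
    // Assumption: the target framework is .NET 8 or later.
    private static readonly CompositeFormat PairFormat =
        CompositeFormat.Parse("{0}----{1}");

    // Hypothetical helper that formats two strings with the cached format.
    public static string Join(string a, string b)
    {
        var sb = new StringBuilder();
        sb.AppendFormat(CultureInfo.InvariantCulture, PairFormat, a, b);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Join("left", "right"));
    }
}
```

Whether this closes the gap with plain Append calls is a separate question that would need its own measurements.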

EDIT: Okay, I've edited John's code somewhat for flexibility and added an "AppendWithCapacity" which just works out the necessary capacity first. Here are the results for the different lengths - for length 1 I used 1,000,000 iterations; for all other lengths I used 100,000. (This was just to get sensible running times.) All times are in milliseconds.

Unfortunately tables don't really work in SO. The lengths were 1, 1000, 10000, 20000

Times:

  • Append: 162, 475, 7997, 17970
  • AppendFormat: 392, 499, 8541, 18993
  • AppendWithCapacity: 139, 189, 1558, 3085

So as it happened, I never saw AppendFormat beat Append - but I did see AppendWithCapacity win by a very substantial margin.

Here's the full code:

using System;
using System.Diagnostics;
using System.Text;

public class StringBuilderTest
{
    static void Append(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(string1);
        sb.Append("----");
        sb.Append(string2);
    }

    static void AppendWithCapacity(string string1, string string2)
    {
        int capacity = string1.Length + string2.Length + 4;
        StringBuilder sb = new StringBuilder(capacity);
        sb.Append(string1);
        sb.Append("----");
        sb.Append(string2);
    }

    static void AppendFormat(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendFormat("{0}----{1}", string1, string2);
    }

    static void Main(string[] args)
    {
        int size = int.Parse(args[0]);
        int iterations = int.Parse(args[1]);
        string method = args[2];

        Action<string,string> action;
        switch (method)
        {
            case "Append": action = Append; break;
            case "AppendWithCapacity": action = AppendWithCapacity; break;
            case "AppendFormat": action = AppendFormat; break;
            default: throw new ArgumentException();
        }

        string string1 = new string('x', size);
        string string2 = new string('y', size);

        // Make sure it's JITted
        action(string1, string2);
        GC.Collect();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i=0; i < iterations; i++)
        {
            action(string1, string2);
        }
        sw.Stop();
        Console.WriteLine("Time: {0}ms", (int) sw.ElapsedMilliseconds);
    }
}

泅渡 2024-07-23 10:26:28

Append will be faster in most cases because there are many overloads of that method, which lets the compiler call the correct one. Since you are using Strings, the StringBuilder can use the String overload of Append.

AppendFormat takes a String and then an Object[] which means that the format will have to be parsed and each Object in the array will have to be ToString'd before it can be added to the StringBuilder's internal array.
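
To illustrate the extra work involved, here is a deliberately simplified stand-in for the parsing step. NaiveFormatSketch and NaiveAppendFormat are hypothetical; this is not the framework's implementation, and it handles only plain "{n}" placeholders (no "{{" escapes, alignment, or format specifiers).

```csharp
using System;
using System.Text;

public static class NaiveFormatSketch
{
    // Hypothetical, simplified stand-in for AppendFormat.
    public static StringBuilder NaiveAppendFormat(
        StringBuilder sb, string format, params object[] args)
    {
        int pos = 0;
        while (pos < format.Length)
        {
            int open = format.IndexOf('{', pos);
            if (open < 0)
            {
                // No more placeholders: copy the trailing literal text.
                sb.Append(format, pos, format.Length - pos);
                break;
            }

            int close = format.IndexOf('}', open);
            if (close < 0) throw new FormatException("Unclosed placeholder.");

            // Copy the literal text that precedes the placeholder.
            sb.Append(format, pos, open - pos);

            // Parse the argument index and convert the argument to text.
            int index = int.Parse(format.Substring(open + 1, close - open - 1));
            object arg = args[index];
            sb.Append(arg == null ? string.Empty : arg.ToString());

            pos = close + 1;
        }
        return sb;
    }

    public static void Main()
    {
        var sb = new StringBuilder();
        NaiveAppendFormat(sb, "{0}----{1}", "left", "right");
        Console.WriteLine(sb.ToString()); // prints "left----right"
    }
}
```

Even this stripped-down version scans the format string and looks up each argument on every call, which three direct Append(string) calls do not have to do.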

Note: To casperOne's point - it is difficult to give an exact answer without more data.

深巷少女 2024-07-23 10:26:28

StringBuilder also has cascaded appends: Append() returns the StringBuilder itself, so you can write your code like this:

StringBuilder sb = new StringBuilder();
sb.Append(string1)
  .Append("----")
  .Append(string2);

Clean, and it generates less IL-code (although that's really a micro-optimization).

梦回旧景 2024-07-23 10:26:28

Of course, profile to know for sure in each case.

That said, I think in general it will be the former, because you aren't repeatedly parsing the format string.

However, the difference would be very small, to the point that you really should consider using AppendFormat in most cases anyway.

掌心的温暖 2024-07-23 10:26:28

I'd assume it was the call that did the least amount of work. Append just concatenates strings, where AppendFormat is doing string substitutions. Of course these days, you never can tell...

只有影子陪我不离不弃 2024-07-23 10:26:28

1 should be faster because it's simply appending the strings, whereas 2 has to create a string based on a format and then append that string. So there's an extra step in there.

娇纵 2024-07-23 10:26:28

In your case 1 is faster; however, it isn't a fair comparison. You should ask about StringBuilder.AppendFormat() vs StringBuilder.Append(string.Format()), where the first one is faster because it works directly with the internal char array.
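
The two variants being compared there can be sketched as follows. FormatThenAppendSketch, Direct and ViaStringFormat are names invented for this example; the sketch only shows that the output is the same and that the second variant builds an intermediate string first.

```csharp
using System;
using System.Text;

public static class FormatThenAppendSketch
{
    // Formats straight into the builder's buffer.
    public static string Direct(string a, string b)
    {
        var sb = new StringBuilder();
        sb.AppendFormat("{0}----{1}", a, b);
        return sb.ToString();
    }

    // Builds a temporary string with string.Format, then copies it in.
    public static string ViaStringFormat(string a, string b)
    {
        var sb = new StringBuilder();
        string temp = string.Format("{0}----{1}", a, b); // extra allocation
        sb.Append(temp);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Direct("left", "right") == ViaStringFormat("left", "right"));
    }
}
```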

Your second option is more readable though.

同尘 2024-07-23 10:26:28

In C# 10/.NET 6+, both the code examples compile down to the same code, due to the new interpolated string handlers built into the compiler. In fact, all of these produce equivalent compiled code:

sb.Append(string1);
sb.Append("----");
sb.Append(string2);
sb.AppendFormat("{0}----{1}",string1,string2);
sb.Append($"{string1}----{string2}");

Because of that, you can go with whatever is most readable.
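
A small sketch for checking that the three forms at least produce the same text. InterpolationSketch and its method names are hypothetical. It does not inspect the compiled code, so it cannot confirm or refute the claim about identical IL; it only shows the forms are interchangeable in terms of output.

```csharp
using System;
using System.Text;

public static class InterpolationSketch
{
    public static string ThreeAppends(string a, string b)
    {
        var sb = new StringBuilder();
        sb.Append(a);
        sb.Append("----");
        sb.Append(b);
        return sb.ToString();
    }

    public static string Format(string a, string b)
    {
        var sb = new StringBuilder();
        sb.AppendFormat("{0}----{1}", a, b);
        return sb.ToString();
    }

    public static string Interpolated(string a, string b)
    {
        var sb = new StringBuilder();
        // With C# 10 / .NET 6+, this call can bind to the Append overload that
        // takes StringBuilder.AppendInterpolatedStringHandler; older compilers
        // build the interpolated string first and then append it.
        sb.Append($"{a}----{b}");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ThreeAppends("left", "right"));
        Console.WriteLine(Format("left", "right"));
        Console.WriteLine(Interpolated("left", "right"));
    }
}
```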
