String.Join vs. StringBuilder: which is faster?
In a previous question about formatting a double[][] to CSV format, it was suggested that using StringBuilder would be faster than String.Join. Is this true?
Short answer: it depends.
Long answer: if you already have an array of strings to concatenate together (with a delimiter), String.Join is the fastest way of doing it.
String.Join can look through all of the strings to work out the exact length it needs, then go again and copy all the data. This means there will be no extra copying involved. The only downside is that it has to go through the strings twice, which means potentially blowing the memory cache more times than necessary.
If you don't have the strings as an array beforehand, it's probably faster to use StringBuilder, but there will be situations where it isn't. If using a StringBuilder means doing lots and lots of copies, then building an array and then calling String.Join may well be faster.
EDIT: This is in terms of a single call to String.Join vs a bunch of calls to StringBuilder.Append. In the original question, we had two different levels of String.Join calls, so each of the nested calls would have created an intermediate string. In other words, it's even more complex and harder to guess about. I would be surprised to see either way "win" significantly (in complexity terms) with typical data.
EDIT: When I'm at home, I'll write up a benchmark which is as painful as possible for StringBuilder. Basically if you have an array where each element is about twice the size of the previous one, and you get it just right, you should be able to force a copy for every append (of elements, not of the delimiter, although that needs to be taken into account too). At that point it's nearly as bad as simple string concatenation, but String.Join will have no problems.
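To make the two shapes being compared concrete, here is a minimal sketch of both approaches for turning a double[][] into CSV text. The class and method names (CsvSketch, WithJoin, WithBuilder) are made up for illustration, and the use of the invariant culture and "\n" as the row separator are assumptions, not something taken from the original question.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class CsvSketch
{
    // Nested String.Join: every row is first joined into its own
    // intermediate string, then the row strings are joined with newlines.
    public static string WithJoin(double[][] data)
    {
        string[] rows = data
            .Select(row => string.Join(",",
                row.Select(v => v.ToString(CultureInfo.InvariantCulture)).ToArray()))
            .ToArray();
        return string.Join("\n", rows);
    }

    // Single StringBuilder: values and separators are appended into one
    // growing buffer, so no per-row intermediate strings are created.
    public static string WithBuilder(double[][] data)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < data.Length; i++)
        {
            if (i > 0) sb.Append('\n');
            for (int j = 0; j < data[i].Length; j++)
            {
                if (j > 0) sb.Append(',');
                sb.Append(data[i][j].ToString(CultureInfo.InvariantCulture));
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var data = new[]
        {
            new[] { 1.5, 2.0, 3.25 },
            new[] { 4.0, 5.5 }
        };
        // Both methods are expected to produce identical text.
        Console.WriteLine(WithJoin(data) == WithBuilder(data));
        Console.WriteLine(WithJoin(data));
    }
}
```

Which of the two is faster on real data is exactly what the answers here disagree about, so this sketch only shows where the intermediate strings come from and makes no claim about a winner.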
Here's my test rig, using int[][] for simplicity; results first:
(update for double results:)
(update re 2048 * 64 * 150)
and with OptimizeForTesting enabled:
So faster, but not massively so; rig (run at console, in release mode, etc):
I don't think so. Looking through Reflector, the implementation of String.Join looks very optimized. It also has the added benefit of knowing the total size of the string to be created in advance, so it doesn't need any reallocation.
I have created two test methods to compare them:
I ran each method 50 times, passing in an array of size [2048][64]. I did this for two arrays; one filled with zeros and another filled with random values. I got the following results on my machine (P4 3.0 GHz, single-core, no HT, running Release mode from CMD):
Increasing the size of the array to [2048][512], while decreasing the number of iterations to 10 got me the following results:
The results are repeatable (almost; with small fluctuations caused by different random values). Apparently String.Join is a little faster most of the time (although by a very small margin).
This is the code I used for testing:
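The answer's own test code is not present in this copy. As a rough stand-in, the following sketch times the two approaches with the parameters described above (a [2048][64] array, 50 iterations, one zero-filled and one random-filled input). Everything in it is an assumption rather than the original code: the names JoinVsBuilderTiming, ViaJoin, ViaBuilder, MakeData and TimeMs are hypothetical, as are the fixed random seed and the single warm-up call.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Text;

public static class JoinVsBuilderTiming
{
    // Two levels of String.Join: one per row, one across rows.
    public static string ViaJoin(double[][] data)
    {
        var rows = new string[data.Length];
        for (int i = 0; i < data.Length; i++)
        {
            var cells = new string[data[i].Length];
            for (int j = 0; j < cells.Length; j++)
                cells[j] = data[i][j].ToString(CultureInfo.InvariantCulture);
            rows[i] = string.Join(",", cells);
        }
        return string.Join(Environment.NewLine, rows);
    }

    // One StringBuilder for the whole output.
    public static string ViaBuilder(double[][] data)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < data.Length; i++)
        {
            if (i > 0) sb.Append(Environment.NewLine);
            for (int j = 0; j < data[i].Length; j++)
            {
                if (j > 0) sb.Append(',');
                sb.Append(data[i][j].ToString(CultureInfo.InvariantCulture));
            }
        }
        return sb.ToString();
    }

    // Builds a rows x cols jagged array, zero-filled or random-filled.
    public static double[][] MakeData(int rows, int cols, bool random)
    {
        var rng = new Random(12345); // fixed seed (an assumption) for repeatable input
        var data = new double[rows][];
        for (int i = 0; i < rows; i++)
        {
            data[i] = new double[cols];
            if (random)
                for (int j = 0; j < cols; j++) data[i][j] = rng.NextDouble();
        }
        return data;
    }

    // Runs the method once as a warm-up, then times the given number of iterations.
    public static long TimeMs(Func<double[][], string> method, double[][] data, int iterations)
    {
        method(data);
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) method(data);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        foreach (bool random in new[] { false, true })
        {
            var data = MakeData(2048, 64, random);
            Console.WriteLine("{0}: Join {1} ms, StringBuilder {2} ms",
                random ? "random" : "zeros",
                TimeMs(ViaJoin, data, 50),
                TimeMs(ViaBuilder, data, 50));
        }
    }
}
```

As with the original, this would need to be run as a Release build from the console for the numbers to mean much, and the timings will vary by machine and runtime version.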
Unless the 1% difference turns into something significant in terms of the time the entire program takes to run, this looks like micro-optimization. I'd write the code that's the most readable/understandable and not worry about the 1% performance difference.
Yes. If you do more than a couple of joins, it will be a lot faster.
When you do a String.Join, the runtime has to:
If you do two joins, it has to copy the data twice, and so on.
StringBuilder allocates one buffer with space to spare, so data can be appended without having to copy the original string. As there is space left over in the buffer, the appended string can be written into the buffer directly.
Then it just has to copy the entire string once, at the end.
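The "buffer with space to spare" idea can be sketched by giving StringBuilder its final capacity up front, so that appends should not need to grow the buffer. This is only an illustration: the class and method names (PresizedBuilderSketch, Concat) are hypothetical, the input is assumed to contain no null strings, and how StringBuilder grows when the capacity is not supplied depends on the runtime version.

```csharp
using System;
using System.Text;

public static class PresizedBuilderSketch
{
    public static string Concat(string[] parts, string separator)
    {
        // First pass: work out the exact length of the result.
        int capacity = 0;
        foreach (string p in parts) capacity += p.Length;
        if (parts.Length > 1) capacity += separator.Length * (parts.Length - 1);

        // Second pass: append into a buffer that is already large enough.
        var sb = new StringBuilder(capacity);
        for (int i = 0; i < parts.Length; i++)
        {
            if (i > 0) sb.Append(separator);
            sb.Append(parts[i]);
        }
        return sb.ToString(); // the final string is produced once, here
    }

    public static void Main()
    {
        Console.WriteLine(Concat(new[] { "a", "bb", "ccc" }, ", "));
    }
}
```

Note that measuring first and copying second is the same two-pass strategy the first answer attributes to String.Join, which is one reason the measured differences between the two approaches tend to be small.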