我知道这个问题已经完成但我有一个稍微不同的转折。 有些人指出这是不成熟的优化,如果我只是为了实用而要求实用性,那么这是完全正确的。 我的问题源于一个实际问题,但我仍然很好奇。
我正在创建一堆 SQL 语句来创建一个脚本(因为它将被保存到磁盘)来重新创建数据库模式(很容易就有数百个表、视图等)。 这意味着我的字符串连接是仅附加的。 根据 MSDN 的说法,StringBuilder 的工作原理是保留一个内部缓冲区(肯定是 char[])并将字符串字符复制到其中,并根据需要重新分配数组。
但是,我的代码有很多重复字符串(“CREATE TABLE [”、“GO\n”等),这意味着我可以利用它们 被实习,但如果我使用 StringBuilder 则不会,因为它们每次都会被复制。 唯一的变量本质上是表名,并且这些变量已经作为字符串存在于内存中的其他对象中。
据我所知,在读入数据并创建保存架构信息的对象后,我所有的字符串信息都可以通过实习重用,是吗?
假设如此,那么字符串的 List 或 LinkedList 不会更快吗,因为它们保留了指向内部字符串的指针? 然后,只需调用一次 String.Concat() 即可对整个字符串进行一次内存分配,且长度完全正确。
列表必须重新分配驻留指针的 string[],而链表必须创建节点并修改指针,因此它们不是“自由”的,但如果我连接数千个驻留字符串< /b> 那么他们看起来会更有效率。
现在我想我可以对每个 SQL 语句的字符计数提出一些启发式的方法。 计算每种类型并获得一个粗略的想法,并预先设置我的 StringBuilder 容量以避免重新分配其 char[],但我必须以相当大的幅度进行超调以减少重新分配的可能性。
因此,对于这种情况,获得单个连接字符串最快的方法是:
- StringBuilder
- List 驻留字符串
- LinkedList; 实习字符串
- 具有容量启发式的
- StringBuilder还有其他吗?
作为上面的一个单独的问题(我可能并不总是去磁盘):输出文件的单个 StreamWriter 会更快吗? 或者,使用列表或链接列表,然后将它们从列表写入文件,而不是首先在内存中连接。
编辑:
根据要求,参考 (.NET 3.5)到 MSDN。 它说:“如果空间可用,则新数据将附加到缓冲区的末尾;否则,将分配一个新的、更大的缓冲区,将原始缓冲区中的数据复制到新缓冲区,然后附加新数据对我来说,这意味着重新分配 char[] 以使其更大(这需要将旧数据复制到调整大小的数组),然后追加。
I know this question has been done but I have a slightly different twist to it. Several have pointed out that this is premature optimization, which is entirely true if I were asking for practicality's sake and practicality's sake only. My problem is rooted in a practical problem but I'm still curious nonetheless.
I'm creating a bunch of SQL statements to create a script (as in it will be saved to disk) to recreate a database schema (easily many many hundreds of tables, views, etc.). This means my string concatenation is append-only. StringBuilder, according to MSDN, works by keeping an internal buffer (surely a char[]) and copying string characters into it and reallocating the array as necessary.
However, my code has a lot of repeat strings ("CREATE TABLE [", "GO\n", etc.) which means I can take advantage of them being interned but not if I use StringBuilder since they would be copied each time. The only variables are essentially table names and such that already exist as strings in other objects that are already in memory.
So as far as I can tell that after my data is read in and my objects created that hold the schema information then all my string information can be reused by interning, yes?
Assuming that, then wouldn't a List or LinkedList of strings be faster because they retain pointers to interned strings? Then it's only one call to String.Concat() for a single memory allocation of the whole string that is exactly the correct length.
A List would have to reallocate string[] of interned pointers and a linked list would have to create nodes and modify pointers, so they aren't "free" to do but if I'm concatenating many thousands of interned strings then they would seem like they would be more efficient.
Now I suppose I could come up with some heuristic on character counts for each SQL statement & count each type and get a rough idea and pre-set my StringBuilder capacity to avoid reallocating its char[] but I would have to overshoot by a fair margin to reduce the probability of reallocating.
So for this case, which would be fastest to get a single concatenated string:
- StringBuilder
- List<string> of interned strings
- LinkedList<string> of interned strings
- StringBuilder with a capacity heuristic
- Something else?
As a separate question (I may not always go to disk) to the above: would a single StreamWriter to an output file be faster yet? Alternatively, use a List or LinkedList then write them to a file from the list instead of first concatenating in memory.
EDIT:
As requested, the reference (.NET 3.5) to MSDN. It says: "New data is appended to the end of the buffer if room is available; otherwise, a new, larger buffer is allocated, data from the original buffer is copied to the new buffer, then the new data is appended to the new buffer." That to me means a char[] that is realloced to make it larger (which requires copying old data to the resized array) then appending.
发布评论
评论(7)
对于您的单独的问题,Win32 有一个 WriteFileGather 函数,它可以有效地将(驻留的)字符串列表写入磁盘 - 但只有在异步调用时才会产生显着差异,因为磁盘写入将掩盖所有但非常大的串联。
对于您的主要问题:除非您的脚本达到兆字节或数万个脚本,否则请不要担心。
您可以期望 StringBuilder 在每次重新分配时将分配大小加倍。 这意味着将缓冲区从 256 字节增加到 1MB 只需 12 次重新分配 - 相当不错,因为您的初始估计与目标相差 3 个数量级。
纯粹作为练习,一些估计:构建 1MB 的缓冲区将占用大约 3MB 内存(1MB 源、1MB 目标、1MB 由于
重新分配期间复制)。
链表实现将扫描大约 2MB(并且忽略每个字符串引用的 8 字节/对象开销)。 因此,与 10Gbit/s 的典型内存带宽和 1MB L2 缓存相比,您节省了 1 MB 内存读/写。)
是的,列表实现可能更快,如果您的缓冲区大一个数量级,则差异将很重要。
对于更常见的小字符串情况,算法增益可以忽略不计,并且很容易被其他因素抵消:StringBuilder 代码可能已经在代码缓存中,并且是微优化的可行目标。 此外,如果最终字符串适合初始缓冲区,则在内部使用字符串意味着根本不进行复制。
使用链表还会将重新分配问题从 O(字符数)降低到 O(段数) - 您的字符串引用列表面临与字符串相同的问题!
So, IMO the implementation of StringBuilder is the right choice, optimized for the common case, and degrades mostly for unexpectedly large target buffers. I'd expect a list implementation to degrade for very many small segments first, which is actually the extreme kind of scenario StringBuilder is trying to optimize for.
尽管如此,对这两种想法进行比较,以及列表何时开始变得更快,还是很有趣的。
For your separate question, Win32 has a WriteFileGather function, which could efficiently write a list of (interned) strings to disk - but it would make a notable difference only when being called asynchronously, as the disk write will overshadow all but extremely large concatenations.
For your main question: unless you are reaching megabytes of script, or tens of thousands of scripts, don't worry.
You can expect StringBuilder to double the allocation size on each reallocation. That would mean growing a buffer from 256 bytes to 1MB is just 12 reallocations - quite good, given that your initial estimate was 3 orders of magnitude off the target.
Purely as an exercise, some estimates: building a buffer of 1MB will sweep roughly 3 MB memory (1MB source, 1MB target, 1MB due to
copying during realloation).
A linked list implementation will sweep about 2MB, (and that's ignoring the 8 byte / object overhead per string reference). So you are saving 1 MB memory reads/writes, compared to a typical memory bandwidth of 10Gbit/s and 1MB L2 cache.)
Yes, a list implementation is potentially faster, and the difference would matter if your buffers are an order of magnitude larger.
For the much more common case of small strings, the algorithmic gain is negligible, and easily offset by other factors: the StringBuilder code is likely in the code cache already, and a viable target for microoptimizations. Also, using a string internally means no copy at all if the final string fits the initial buffer.
Using a linked list will also bring down the reallocation problem from O(number of characters) to O(number of segments) - your list of string references faces the same problem as a string of characters!
So, IMO the implementation of StringBuilder is the right choice, optimized for the common case, and degrades mostly for unexpectedly large target buffers. I'd expect a list implementation to degrade for very many small segments first, which is actually the extreme kind of scenario StringBuilder is trying to optimize for.
Still, it would be interesting to see a comparison of the two ideas, and when the list starts to be faster.
如果我要实现这样的东西,我永远不会构建 StringBuilder (或脚本的内存缓冲区中的任何其他内容)。
我只是将其流式传输到您的文件中,并使所有字符串内联。
这是一个示例伪代码(语法上不正确或任何其他内容):
然后,您将永远不需要脚本的内存表示,以及所有字符串的复制。
意见?
If I were implementing something like this, I would never build a StringBuilder (or any other in memory buffer of your script).
I would just stream it out to your file instead, and make all strings inline.
Here's an example pseudo code (not syntactically correct or anything):
Then, you'll never need an in memory representation of your script, with all the copying of strings.
Opinions?
根据我的经验,对于大量字符串数据,我正确分配的 StringBuilder 的性能优于大多数其他方法。 为了防止重新分配,甚至超出估计值 20% 或 30%,浪费一些内存也是值得的。 我目前没有硬性数字来使用我自己的数据来支持它,但请看一下 此页面了解更多信息。
但是,正如 Jeff 喜欢指出的那样,不要过早优化!
编辑:正如 @Colin Burnett 指出的,Jeff 进行的测试与 Brian 的测试不一致,但链接 Jeff 的帖子的目的是关于一般的过早优化。 杰夫页面上的几位评论者指出了他的测试存在的问题。
In my experience, I properly allocated StringBuilder outperforms most everything else for large amounts of string data. It's worth wasting some memory, even, by overshooting your estimate by 20% or 30% in order to prevent reallocation. I don't currently have hard numbers to back it up using my own data, but take a look at this page for more.
However, as Jeff is fond of pointing out, don't prematurely optimize!
EDIT: As @Colin Burnett pointed out, the tests that Jeff conducted don't agree with Brian's tests, but the point of linking Jeff's post was about premature optimization in general. Several commenters on Jeff's page noted issues with his tests.
实际上,
StringBuilder
在内部使用了String
的实例。String
实际上在System
程序集中是可变的,这就是为什么StringBuilder
可以在其之上构建。 通过在创建实例时分配合理的长度,您可以使StringBuilder
更加有效。 这样您将消除/减少调整大小操作的数量。字符串驻留适用于可以在编译时识别的字符串。 因此,如果您在执行期间生成大量字符串,它们将不会被保留,除非您自己通过调用字符串上的 interning 方法来执行此操作。
只有当你的琴弦相同时,实习才会对你有利。 几乎相同的字符串不会从驻留中受益,因此
"SOMESTRINGA"
和"SOMESTRINGB"
即使被驻留,也将是两个不同的字符串。Actually
StringBuilder
uses an instance ofString
internally.String
is in fact mutable within theSystem
assembly, which is whyStringBuilder
can be build on top of it. You can makeStringBuilder
a wee bit more effective by assigning a reasonable length when you create the instance. That way you will eliminate/reduce the number of resize operations.String interning works for strings that can be identified at compile time. Thus if you generate a lot of strings during the execution they will not be interned unless you do so yourself by calling the interning method on string.
Interning will only benefit you if your strings are identical. Almost identical strings doesn't benefit from interning, so
"SOMESTRINGA"
and"SOMESTRINGB"
will be two different strings even if they are interned.如果所有(或大多数)被连接的字符串都被保留,那么您的方案可能会提高性能,因为它可能会使用更少的内存,并且可以保存一些大的字符串副本。
然而,它是否真正提高了性能取决于您正在处理的数据量,因为改进是恒定因素,而不是算法的数量级。
真正判断的唯一方法是使用这两种方式运行您的应用程序并测量结果。 但是,除非您面临巨大的内存压力,并且需要一种节省字节的方法,否则我不会打扰,只会使用字符串生成器。
If all (or most) of the strings being concatenated are interned, then your scheme MIGHT give you a performance boost, as it could potentally use less memory, and could save a few large string copies.
However, whether or not it actually improves perf depends on the volume of data you are processing, because the improvement is in constant factors, not in the order of magnitude of the algorithm.
The only way to really tell is to run your app using both ways and measure the results. However, unless you are under significant memory pressure, and need a way to save bytes, I wouldn't bother and would just use string builder.
StringBuilder
不使用char[]
来存储数据,它使用内部可变字符串。 这意味着不需要额外的步骤来创建最终字符串,就像连接字符串列表时一样,StringBuilder
只是将内部字符串缓冲区作为常规字符串返回。StringBuilder
为增加容量而进行的重新分配意味着数据平均会额外复制 1.33 次。 如果您可以在创建 StringBuilder 时提供对大小的良好估计,则可以进一步减小该大小。然而,为了获得一些视角,您应该看看您正在尝试优化的内容。 程序中花费的大部分时间是将数据实际写入磁盘,因此即使您可以将字符串处理优化为使用 StringBuilder 的两倍(这不太可能) ,总体差异仍然只有百分之几。
A
StringBuilder
doesn't use achar[]
to store the data, it uses an internal mutable string. That means that there is no extra step to create the final string as it is when you concatenate a list of strings, theStringBuilder
just returns the internal string buffer as a regular string.The reallocations that the
StringBuilder
does to increase the capacity means that the data is by average copied an extra 1.33 times. If you can provide a good estimate on the size when you create theStringBuilder
you can reduce that even furter.However, to get a bit of perspective, you should look at what it is that you are trying to optimise. What will take most of the time in your program is to actually write the data to disk, so even if you can optimise your string handling to be twice as fast as using a
StringBuilder
(which is very unlikely), the overall difference will still only be a few percent.你考虑过用C++来做这个吗? 是否有一个库类已经构建了 T/SQL 表达式,最好用 C++ 编写。
字符串中最慢的是 malloc。 在 32 位平台上每个字符串需要 4KB。 考虑优化创建的字符串对象的数量。
如果您必须使用 C#,我会推荐这样的内容:
如果性能如此重要,我什至会让计算机评估使用依赖项注入框架进行对象实例化的最佳路径。
Have you considered C++ for this? Is there a library class that already builds T/SQL expressions, preferably written in C++.
Slowest thing about strings is malloc. It takes 4KB per string on 32-bit platforms. Consider optimizing number of string objects created.
If you must use C#, I'd recommend something like this:
I would even go as far as letting the computer evaluate the best path for object instantiation with dependency injection frameworks, if perf is THAT important.