Initial capacity of collection types, e.g. Dictionary, List
Certain collection types in .NET have an optional "initial capacity" constructor parameter. For example:
Dictionary<string, string> something = new Dictionary<string,string>(20);
List<string> anything = new List<string>(50);
I can't seem to find what the default initial capacity is for these objects on MSDN.
If I know I will only be storing 12 or so items in a dictionary, doesn't it make sense to set the initial capacity to something like 20?
My reasoning is this: assuming the capacity grows as it does for a StringBuilder, doubling each time the capacity is reached, and each reallocation is costly, why not pre-set the size to something you know will hold your data, with some extra room just in case? On the other hand, if the initial capacity is 100 and I know I will only need a dozen or so, the rest of that memory seems to be allocated for nothing.
If the default values are not documented, the reason is likely that the optimal initial capacity is an implementation detail and subject to change between framework versions. That is, you shouldn't write code that assumes a certain default value.
The constructor overloads with a capacity are for cases in which you know better than the class what number of items are to be expected. For example, if you create a collection of 50 values and know that this number will never increase, you can initialize the collection with a capacity of 50, so it won't have to resize if the default capacity is lower.
That said, you can determine the default values using Reflector. For example, in .NET 4.0 (and probably previous versions as well),
a List<T> is initialized with a capacity of 0. When the first item is added, it is reinitialized to a capacity of 4. Subsequently, whenever the capacity is reached, the capacity is doubled.
a Dictionary<TKey, TValue> is initialized with a capacity of 0 as well. But it uses a completely different algorithm to increase the capacity: it always increases the capacity to a prime number.
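The growth behaviour described above can be observed without a decompiler. The following is a minimal sketch, written as top-level statements (C# 9 or later), that records each capacity a default-constructed List<int> passes through. For the dictionary it relies on Dictionary<TKey, TValue>.EnsureCapacity, which is available in .NET Core 2.1 and later but not in .NET Framework, so that part is an assumption about the runtime in use. The exact numbers printed are implementation details and may differ between framework versions.

```csharp
using System;
using System.Collections.Generic;

// Record every distinct capacity a default-constructed list passes through
// while 20 items are added one at a time.
var list = new List<int>();
var capacities = new List<int> { list.Capacity };
for (int i = 0; i < 20; i++)
{
    list.Add(i);
    if (list.Capacity != capacities[capacities.Count - 1])
    {
        capacities.Add(list.Capacity);
    }
}
Console.WriteLine("List<int> capacities: " + string.Join(", ", capacities));

// Dictionary<TKey, TValue> exposes no Capacity property on older runtimes.
// EnsureCapacity(0) never needs to grow the table, so it simply reports the
// current capacity (assumes .NET Core 2.1 or later).
var dict = new Dictionary<string, string>(20);
int dictCapacity = dict.EnsureCapacity(0);
Console.WriteLine("Dictionary capacity after requesting 20: " + dictCapacity);
```

The requested dictionary capacity is a lower bound: the value reported may be larger than 20 because the implementation rounds up to a size it prefers.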
If you know the size, then tell the collection; this is a minor optimisation in most "small" cases, but useful for bigger collections. I would mainly worry about this when throwing a "decent" amount of data in, as it can then avoid having to allocate, copy and collect multiple arrays.
Most collections indeed use a doubling strategy.
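To make the "allocate, copy and collect multiple arrays" cost concrete, here is a rough sketch that counts how often a List<int> changes capacity while it is filled. The item count of 100,000 and the helper name CountResizes are made up for illustration. Each capacity change corresponds to a new backing array being allocated and the old contents copied over.

```csharp
using System;
using System.Collections.Generic;

const int ItemCount = 100_000; // hypothetical "decent" amount of data

// Adds `count` items and returns how many times the capacity changed,
// i.e. how many times the backing array was reallocated.
static int CountResizes(List<int> target, int count)
{
    int resizes = 0;
    int lastCapacity = target.Capacity;
    for (int i = 0; i < count; i++)
    {
        target.Add(i);
        if (target.Capacity != lastCapacity)
        {
            resizes++;
            lastCapacity = target.Capacity;
        }
    }
    return resizes;
}

int defaultResizes = CountResizes(new List<int>(), ItemCount);
int presizedResizes = CountResizes(new List<int>(ItemCount), ItemCount);

Console.WriteLine("Resizes with default capacity: " + defaultResizes);
Console.WriteLine("Resizes with preset capacity:  " + presizedResizes);
```

With a doubling strategy the number of resizes grows only logarithmically with the item count, which is why the saving is small for short lists and more noticeable for large ones, where each copy moves a lot of data.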
Checking the source, the default capacity for both List<T> and Dictionary<TKey, TValue> is 0.
Another issue with the ConcurrentDictionary (currently) and using its constructor to set an initial size is that its performance appears to be hindered.
For example, here's some example code and benchmarks I tried.
I ran the code on my machine and got similar results.
That is, when the initial size is specified, it does nothing to increase the ConcurrentDictionary's speed when adding objects. Technically, I think it should because it doesn't have to take time or resources to resize itself.
Yes, it may not run as fast as a normal Dictionary, but I would still expect a ConcurrentDictionary with its initial size set to have consistent, faster performance than a ConcurrentDictionary that doesn't have its initial size set, especially when one knows in advance the number of items that are going to be added to it.
So the moral of the story is that setting the initial size doesn't always guarantee a performance improvement.
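The code and benchmarks referred to above are not reproduced here, so the following is only a rough sketch of how such a comparison might be set up, not the original test. The item count, the helper name TimeAdds and the use of Environment.ProcessorCount as the concurrency level are all assumptions. A single Stopwatch run like this is crude (no warm-up, no repetition), so treat any numbers it prints as indicative at best.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

const int Items = 200_000; // hypothetical workload size

// Adds `count` sequential keys and returns the elapsed milliseconds.
static long TimeAdds(ConcurrentDictionary<int, int> dictionary, int count)
{
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < count; i++)
    {
        dictionary.TryAdd(i, i);
    }
    stopwatch.Stop();
    return stopwatch.ElapsedMilliseconds;
}

var withDefault = new ConcurrentDictionary<int, int>();
// The (concurrencyLevel, capacity) overload is the one that presets the size.
var withCapacity = new ConcurrentDictionary<int, int>(Environment.ProcessorCount, Items);

long defaultMs = TimeAdds(withDefault, Items);
long presizedMs = TimeAdds(withCapacity, Items);

Console.WriteLine("Default capacity: " + defaultMs + " ms");
Console.WriteLine("Preset capacity:  " + presizedMs + " ms");
```

Measuring on your own runtime and workload is the only way to know whether presizing helps in a given case.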
Use this regular expression
new List[<].*[>](\(\))?[ ]+[{]
in Visual Studio (Ctrl+Shift+F, with the regular expression option on) to search for all lists that you might want to add an initial capacity to ;-)
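To see what kind of lines such a search is meant to flag, here is a small sketch that runs a pattern of this shape through System.Text.RegularExpressions. The pattern in the sketch writes the optional () as a group, which is an assumption about the intended form, and the sample lines are made up. Visual Studio's Find in Files dialect may differ slightly between versions.

```csharp
using System;
using System.Text.RegularExpressions;

// Matches "new List<...>" followed by an optional "()" and then a
// collection initializer "{", i.e. a list created without a capacity.
var pattern = new Regex(@"new List[<].*[>](\(\))?[ ]+[{]");

string[] samples =
{
    "var a = new List<int>() { 1, 2, 3 };",   // no capacity, empty parentheses
    "var b = new List<int> { 1, 2, 3 };",     // no capacity, no parentheses
    "var c = new List<int>(3) { 1, 2, 3 };",  // capacity given
    "var d = new List<int>(50);",             // capacity given, no initializer
};

foreach (var line in samples)
{
    Console.WriteLine(pattern.IsMatch(line) + ": " + line);
}
```

Only the first two samples are flagged. Lists created with a plain new List<int>(); and no initializer are not matched by this pattern, so a second search would be needed to find those.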