Is there a limit to the number of entries in a Dictionary?
I have about 3000 different files I need to organize and retrieve at different times during the game.
I created my own struct of variables. I was thinking about creating a "Dictionary" at the beginning of my application, and simply loading all my files before the game starts.
I'm wondering about performance: will a dictionary with this many entries cause my application to be slow?
Would a large dictionary make "TryGetValue" and "ContainsKey" run slower?
Thanks for the advice!
7 Answers
TryGetValue and ContainsKey should be pretty fast at that size, as long as the key has well-distributed hashes.
A Dictionary has an indexable number of "buckets". When it adds or looks for a value by a key it will take the value returned by GetHashCode(), hash it down again to be less than the number of buckets (generally something simple like modulo, but the implementation isn't defined), and look in the relevant bucket.
The bucket will currently have zero or more items. The dictionary will compare each item with the key using .Equals().
The first bit of finding the right bucket is going to be in constant time O(1). The second bit of comparing the key with the keys in the bucket is going to be in linear time O(n), where n relates only to the number of items in that bucket, not to the whole collection.
Generally there should be very few items in each bucket (the number of buckets will grow to try to keep this the case) so the operation is essentially constant time.
If, however, your hash codes are poorly implemented, there will be lots of keys in the same bucket. The time complexity will get closer and closer to O(n), as can be seen by experimenting with an object with a deliberately bad GetHashCode that just returns 0 every time. In its worst case it is worse than a List, since a List is also O(n), but a Dictionary has more overhead.
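To see that degenerate case concretely, here's a minimal sketch (the BadKey type and the timing loop are mine, purely for illustration): every key hashes to 0, so all 3000 entries pile into a single bucket and each lookup scans them linearly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// A key whose hash is deliberately useless: every instance lands in
// the same bucket, so each lookup degrades to a linear scan, O(n).
class BadKey
{
    public int Id;
    public BadKey(int id) { Id = id; }
    public override int GetHashCode() => 0;   // all keys collide
    public override bool Equals(object obj) => obj is BadKey k && k.Id == Id;
}

class Program
{
    static void Main()
    {
        var bad = new Dictionary<BadKey, int>();
        var good = new Dictionary<int, int>();
        for (int i = 0; i < 3000; i++)
        {
            bad[new BadKey(i)] = i;
            good[i] = i;
        }

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 3000; i++) bad.ContainsKey(new BadKey(i));
        Console.WriteLine($"colliding keys:        {sw.Elapsed}");

        sw.Restart();
        for (int i = 0; i < 3000; i++) good.ContainsKey(i);
        Console.WriteLine($"well-distributed keys: {sw.Elapsed}");
    }
}
```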
Does any of this mean you should worry? No, even relatively naïve hashing methods should give relatively good results. If you're using a string key, then it's probably already going to be more than good enough. If you're using a simple built-in type, then even more so.
If you do find that accessing the dictionary is slow though, then you want to pay attention to this and either fix the GetHashCode() method or create an IEqualityComparer (which lets you define outside rules for GetHashCode() and Equals() for use with dictionaries, hashsets, etc).
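For the IEqualityComparer route, here's a hedged sketch of what that looks like (the comparer below is a hypothetical example that matches string keys case-insensitively; for strings specifically, the built-in StringComparer.OrdinalIgnoreCase already does this):

```csharp
using System;
using System.Collections.Generic;

// External GetHashCode()/Equals() rules for string keys, so the key
// type itself doesn't have to be modified. Keys that compare equal
// MUST produce equal hash codes, hence the normalization in both methods.
class CaseInsensitiveComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) =>
        string.Equals(x, y, StringComparison.OrdinalIgnoreCase);

    public int GetHashCode(string s) =>
        s.ToUpperInvariant().GetHashCode();
}

class Program
{
    static void Main()
    {
        // The dictionary consults the comparer instead of the key's own methods.
        var files = new Dictionary<string, int>(new CaseInsensitiveComparer());
        files["Intro.PNG"] = 1;
        Console.WriteLine(files.ContainsKey("intro.png"));   // True
    }
}
```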
Most likely, though, 3000 is nothing; it'll be fine.
3000 entries is piddling for a Dictionary<>. That will not be a source of slowdown. Reading 3000 different files into memory at startup, on the other hand, will be slow. You'll be much better off reading files into memory only at the time they're needed, but keeping them in memory afterwards for subsequent accesses.
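A minimal sketch of that load-on-demand pattern, assuming the files are looked up by path and their contents fit in a byte[] (the FileCache name and shape are mine, not the answerer's):

```csharp
using System.Collections.Generic;
using System.IO;

class FileCache
{
    private readonly Dictionary<string, byte[]> _cache =
        new Dictionary<string, byte[]>();

    // Hits the disk only the first time a file is requested;
    // afterwards the bytes are served straight from the dictionary.
    public byte[] GetFile(string path)
    {
        byte[] data;
        if (!_cache.TryGetValue(path, out data))
        {
            data = File.ReadAllBytes(path);
            _cache[path] = data;
        }
        return data;
    }
}
```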
No, it won't. It will consume memory, but TryGetValue and ContainsKey should be pretty fast, as a dictionary is a hashtable and access to the elements by key is constant: it doesn't depend on the number of elements.
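As a side note on those two methods, a small usage sketch (the files dictionary is hypothetical): TryGetValue hashes the key once, while ContainsKey followed by the indexer hashes it twice, so the former is the idiomatic choice when you also need the value.

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        var files = new Dictionary<string, byte[]> { ["intro.png"] = new byte[16] };

        // One hash lookup: find the bucket and return the value together.
        if (files.TryGetValue("intro.png", out byte[] data))
            Console.WriteLine($"found {data.Length} bytes");

        // Two hash lookups of the same key: ContainsKey, then the indexer.
        if (files.ContainsKey("intro.png"))
            Console.WriteLine($"found {files["intro.png"].Length} bytes");
    }
}
```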
Provided the hashcode algorithm for the dictionary key type spreads the hashcodes relatively evenly across the Int32 space, hashcode lookup is unaffected by dictionary size.
See http://en.wikipedia.org/wiki/Hashtable#Performance_analysis for more details.
Dictionaries in .NET use a hash table lookup scheme, so adding entries has very little, if any, effect on lookup performance. The only issue you will have might be memory usage. A dictionary of 3000 items will consume roughly 3000 times the storage used by the key plus the value types. If it's just a simple struct without huge binary blobs, 3000 is downright tiny.
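If you want a rough empirical check of that memory cost, one crude approach is to compare GC.GetTotalMemory before and after populating the dictionary (the figures are approximate, and the string value here is just a stand-in for your own struct):

```csharp
using System;
using System.Collections.Generic;

class MemoryCheck
{
    static void Main()
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);

        var dict = new Dictionary<int, string>();
        for (int i = 0; i < 3000; i++)
            dict[i] = "file_" + i;   // stand-in for your own struct

        long after = GC.GetTotalMemory(forceFullCollection: true);
        Console.WriteLine($"~{after - before:N0} bytes for 3000 entries");

        GC.KeepAlive(dict);   // keep the dictionary alive through the measurement
    }
}
```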
Your bottleneck will not be the dictionary's performance but rather the reading of 3000 files.
As with most things with computers (and particularly with performance): "It Depends (tm)".
It all depends on the implementation of the Dictionary.
It could be done as a binary tree, in which case lookup should be O(log2 N), which means lookup time grows slowly as the size of the dictionary grows.
It could be done as a hash table, which, in theory, is O(1), which means that a lookup will always take the same amount of time regardless of the size of the dictionary; but that's the theory, and in practice it depends on the number of buckets and the quality of the hash code. If many items end up in the same bucket, requiring a linear search, things will slow down considerably as the dictionary grows.
However, the dictionary would have to grow beyond 3000 by several orders of magnitude before you see a noticeable difference.
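.NET actually ships both shapes: Dictionary<TKey, TValue> is the hash table, and SortedDictionary<TKey, TValue> is the binary-tree variant (a red-black tree) with O(log n) lookups. A small sketch contrasting the two at this size (timings are illustrative and will vary by machine):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        var hashed = new Dictionary<int, int>();       // hash table: ~O(1) lookup
        var tree = new SortedDictionary<int, int>();   // red-black tree: O(log n) lookup
        for (int i = 0; i < 3000; i++) { hashed[i] = i; tree[i] = i; }

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 3000; i++) hashed.ContainsKey(i);
        Console.WriteLine($"Dictionary:       {sw.Elapsed}");

        sw.Restart();
        for (int i = 0; i < 3000; i++) tree.ContainsKey(i);
        Console.WriteLine($"SortedDictionary: {sw.Elapsed}");
    }
}
```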