Reasonable size for trees and dictionaries
I'm currently implementing a very complex tree structure to allow for near-instant data access, instead of re-processing on each request.
I'm just wondering if there is a theoretical, or practical, limit at which the size of a tree becomes too big, or a point at which a dictionary becomes too collision-filled to function correctly/quickly?
A general answer would be appreciated but C#-specific information would be much better!
3 Answers
In .NET the maximum number of elements in either a tree or a Dictionary is 2^31 - 1 (counts are 32-bit signed integers), and probably a few less in practice due to overhead.
In practical terms, you will probably have run out of memory long before then!
If the tree remains balanced then searches will remain approx. O(log N).
Dictionaries are more sensitive to the underlying algorithm used; for instance, there are many hashing schemes with different characteristics.
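
To make that last point concrete, here is a minimal C# sketch (the key strings and element count are just illustrative) contrasting the default string hash with a deliberately degenerate comparer. When every key collides, inserts and lookups fall back to scanning one ever-growing chain:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// A deliberately bad comparer: every key hashes to the same bucket,
// so each insert/lookup must walk a single long collision chain.
class ConstantHashComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => x == y;
    public int GetHashCode(string obj) => 42; // all keys collide
}

class Program
{
    static void Main()
    {
        const int n = 20_000;

        var good = new Dictionary<string, int>();                          // default string hash
        var bad = new Dictionary<string, int>(new ConstantHashComparer()); // degenerate hash

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) good["key" + i] = i;
        Console.WriteLine($"Default hash:  {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int i = 0; i < n; i++) bad["key" + i] = i;
        Console.WriteLine($"Constant hash: {sw.ElapsedMilliseconds} ms");
    }
}
```

The constant-hash version should come out dramatically slower, even though both dictionaries end up holding identical data.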
It depends on what you consider a large amount. Hundreds or thousands should be fine, but at millions it may be worth looking for something specialised.
A tree will get slower as it grows, depending upon your storage technique and rebalancing.
A dictionary should be fairly consistent, but make sure you construct it with a size appropriate to the amount of data you're likely to store (maybe x2 to be safe).
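
As a rough illustration of that pre-sizing advice (the expected count is made up, and the x2 margin is this answer's rule of thumb, not a framework requirement):

```csharp
using System.Collections.Generic;

// Expecting roughly a million entries (hypothetical figure).
int expectedCount = 1_000_000;

// Passing a capacity up front avoids repeated internal resizes;
// each resize allocates a bigger bucket array and rehashes every
// entry already stored.
var map = new Dictionary<string, int>(expectedCount * 2);

for (int i = 0; i < expectedCount; i++)
    map["item" + i] = i;
```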
See this question - it's one of the first I answered on SO :)
The problem was slow performance building a dictionary with approx 900,000 items. I got the time down from 10+ minutes to 0.34 seconds.
The lesson being that a dictionary is only as good as your hash function: if you can generate a unique hash quickly, it'll run like lightning.
Hope this helps,
EDIT:
The comparison class isn't important; .NET strings have a very strong hash function and therefore have excellent performance in dictionaries. That guy's problem would have just "gone away" if he'd been using single strings instead of pairs of strings.
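
For example, a minimal sketch (hypothetical keys, assuming a C# version with ValueTuple support) of two easy ways to key a dictionary on a pair of strings while still getting a single strong hash:

```csharp
using System;
using System.Collections.Generic;

// Option 1: combine the pair into one string and rely on string's
// built-in hash. The separator guards against ambiguous joins,
// e.g. ("ab", "c") versus ("a", "bc").
var byCombined = new Dictionary<string, int>();
byCombined["first" + "\u0000" + "second"] = 1;

// Option 2: use a value tuple as the key. ValueTuple's GetHashCode
// already mixes the component string hashes, so no custom
// IEqualityComparer is needed.
var byTuple = new Dictionary<(string, string), int>();
byTuple[("first", "second")] = 2;

Console.WriteLine(byCombined.Count + byTuple.Count); // 2
```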