Efficient trie implementation for .NET
I am looking for a trie implementation for .NET.
I am planning to use it as the index structure for my in-memory object pool. It need not be thread-safe (only one thread will be updating it), but it should cope gracefully with at least 20 million items and keep lookup performance constant as the collection grows.
The ones I found on the net seem to be sample code or toy projects, so I am really looking for a production-quality implementation. Commercial libraries are also OK, if available.
PS: I selected tries because the hash table implementations I have seen seem to use too much memory and tend to cause memory fragmentation, as they are based on arrays. Any other container with O(1) lookup and benign memory usage for a large number of items would also be OK.
Thank you.
Comments (2)
Take a look at this library: TrieNet
In my personal opinion, attempting to second-guess .NET's own memory management is not a practice I'd recommend. You simply can't exert the level of control over memory allocation that you can in a native scenario, but equally you shouldn't need to. I was obsessed with doing this when I first moved from C++ (where I would regularly work with my own heaps and write memory-localisation routines etc), but it swiftly became apparent that I just didn't need to, nor could I.
For example, you could have an array of MyPooledObject at the bottom of your trie, but if that is a reference type, then you've just got an array of references, where the actual memory for each object is somewhere else that you can't control (unless you adapt your own host for the runtime). That leaves using a value type instead, but these are simply not suitable for use in a pooled scenario, because custom value types should be immutable (I can say that safely without justifying it: just google 'immutable' and 'struct', targeting site:stackoverflow.com, to see more) and are therefore no good as reusable objects.
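To make the reference-versus-value distinction concrete, here is a minimal C# sketch. The types PooledClass and PooledStruct and the class ArrayLayoutDemo are hypothetical names invented for illustration; they are not from the post or from any library. It only shows the copy semantics that lie behind the usual "structs should be immutable" advice, not a pooling design.

```csharp
using System;

// Hypothetical types for illustration only.
public sealed class PooledClass { public int Value; }   // reference type
public struct PooledStruct { public int Value; }        // value type

public static class ArrayLayoutDemo
{
    // An array of a reference type stores references. Reading an element
    // copies the reference, so both variables see the same heap object.
    public static int MutateViaReferenceCopy()
    {
        var items = new PooledClass[] { new PooledClass { Value = 1 } };
        PooledClass alias = items[0];
        alias.Value = 42;
        return items[0].Value; // 42: the shared object was changed
    }

    // An array of a value type stores the data inline. Reading an element
    // into a local copies the data, so the change to the copy is lost.
    public static int MutateViaStructCopy()
    {
        var items = new PooledStruct[] { new PooledStruct { Value = 1 } };
        PooledStruct copy = items[0];
        copy.Value = 42;
        return items[0].Value; // 1: only the local copy was changed
    }

    // Writing through the array element itself does update it in place.
    public static int MutateInPlace()
    {
        var items = new PooledStruct[] { new PooledStruct { Value = 1 } };
        items[0].Value = 42;
        return items[0].Value; // 42
    }
}
```

The second method is the trap: a mutable struct that is handed out of a pool by value silently becomes a separate copy, which is one reason mutable value types make poor reusable objects.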
If you need an indexed collection of objects in .NET where each object can be identified by a hashable key, then use a Dictionary.
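As a rough sketch of the Dictionary suggestion, the wrapper below indexes pooled objects by a key. PooledItem, ObjectIndex, and the choice of a long key are assumptions made for this example, not details from the question. The only framework type used is System.Collections.Generic.Dictionary<TKey, TValue>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pooled object; the key type (long) is an assumption.
public sealed class PooledItem
{
    public long Id { get; }
    public string Payload { get; set; }

    public PooledItem(long id, string payload)
    {
        Id = id;
        Payload = payload;
    }
}

// Hypothetical thin wrapper around Dictionary<long, PooledItem>.
public sealed class ObjectIndex
{
    private readonly Dictionary<long, PooledItem> _items;

    public ObjectIndex(int expectedCount)
    {
        // Supplying the expected size up front lets the dictionary allocate
        // its internal storage once instead of resizing as it grows.
        _items = new Dictionary<long, PooledItem>(expectedCount);
    }

    public int Count => _items.Count;

    // Adds the item, or replaces an existing item with the same Id.
    public void Add(PooledItem item) => _items[item.Id] = item;

    // Returns false (and a null item) when the key is absent.
    public bool TryGet(long id, out PooledItem item) => _items.TryGetValue(id, out item);

    public bool Remove(long id) => _items.Remove(id);
}
```

For the scenario in the question this might be constructed as `new ObjectIndex(20_000_000)`. Whether the resulting memory footprint is acceptable would need to be measured on the target runtime rather than assumed.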
If you have too many objects to fit in memory then either:
1) Get more memory
2) Use a database and cache local segments of it
Or both: you could consider looking at AppFabric and its cache features; that way you can build a farm of machines dedicated to running in-memory caches of millions of objects. The cost of the hardware will probably be less than the cost of developing your own memory management solution for .NET :)