Is there a data store where I can access data directly by array index instead of by hash key? Redis? MongoDB?
I need an external, memory-efficient (!) C/C++ data store for a Java app that avoids the cost of a normal database lookup (B-tree) and instead uses my IDs as array indexes. Is there an open source solution for this? I have implemented this in C++ as an in-memory structure only, but I would like a "store to disk" option in case of a crash or for backups. A Java binding would also be cool.
E.g. Redis looks good, but when reading the docs I see that in general things are accessed by hash keys, which are O(1) only in theory. Or can I somehow force the hashing scheme to match the storage index? Lists are not appropriate either, as they are implemented as linked lists. Or what about MongoDB?
And yes, I really need fast read access (writes can be "okayish slow" :)). This is not premature optimization, but if there is no alternative I'll try Redis before rolling my own. Also, Java is not an option (as I said: memory efficient ;))
Comments (2)
With a remote key-value store, the overhead is very often dominated by the network and protocol handling rather than by the data access itself. That's why, with efficient key-value stores (Redis, for instance), almost all operations actually have the same cost.
The Redis benchmark page contains a good illustration of this point.
In other words, in the context of an in-memory remote store, and considering only latency, a random-access array will perform essentially the same as a hash table, and even less efficient O(log n) containers such as red-black trees, B-trees, etc. will be quite close.
If you really want maximum performance, I would suggest using an embedded (i.e. in-process) store. For instance, both BerkeleyDB and Tokyo Cabinet provide disk-based random-access containers for fixed-length records.
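To make the "fixed-length records addressed by index" idea concrete, here is a minimal sketch in plain C++ that uses neither BerkeleyDB nor Tokyo Cabinet. The `Record` layout, the `RecordFile` class, and the 2^40 ID limit are all assumptions made up for illustration. Record `id` is simply stored at byte offset `id * sizeof(Record)`, so a lookup is one seek plus one read, with no hashing and no tree traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical fixed-length record; the fields are assumptions for illustration.
struct Record {
    std::int64_t value;
    double score;
};

// Minimal disk-backed array: record `id` lives at offset id * sizeof(Record).
class RecordFile {
public:
    explicit RecordFile(const std::string& path) {
        // Try to open an existing file for reading and writing.
        file_.open(path, std::ios::in | std::ios::out | std::ios::binary);
        if (!file_.is_open()) {
            // The file does not exist yet: create it empty, then reopen.
            std::ofstream create(path, std::ios::binary);
            create.close();
            file_.open(path, std::ios::in | std::ios::out | std::ios::binary);
        }
        if (!file_.is_open()) throw std::runtime_error("cannot open " + path);
    }

    // Write the record for `id`, growing the file if needed.
    void put(std::uint64_t id, const Record& r) {
        file_.clear();  // reset any EOF/fail state left by an earlier read
        file_.seekp(offsetOf(id));
        file_.write(reinterpret_cast<const char*>(&r), sizeof(Record));
        file_.flush();
        if (!file_) throw std::runtime_error("write failed");
    }

    // Read the record for `id`; returns false if it lies past the end of the file.
    bool get(std::uint64_t id, Record& out) {
        file_.clear();
        file_.seekg(offsetOf(id));
        file_.read(reinterpret_cast<char*>(&out), sizeof(Record));
        return static_cast<std::size_t>(file_.gcount()) == sizeof(Record);
    }

private:
    static std::streamoff offsetOf(std::uint64_t id) {
        // Arbitrary sanity limit (an assumption) so the offset cannot overflow.
        assert(id < (std::uint64_t{1} << 40));
        return static_cast<std::streamoff>(id * sizeof(Record));
    }

    std::fstream file_;
};
```

Usage would be along the lines of `RecordFile store("records.bin"); store.put(3, Record{42, 1.5});` followed by `store.get(3, r)`. Note the limitations: raw struct bytes are not portable across machines with different layouts or endianness, slots that were never written but sit below the highest ID typically read back as zero-filled, and a real store would add memory mapping or caching and some form of crash safety.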
KDB is the go-to solution for this problem in the financial systems (algo trading) world. Be prepared to have your brain melted by the syntax though. Oh, and it is not open source.