What is the maximum size at which Tokyo Cabinet still works well?
I read in an article called "Hands-on Cassandra" that Tokyo Cabinet is not good for big data. Why? How many bytes does TC need to store before it starts to perform badly? Is it possible to determine an approximate value?
Comments (1)
Based on this article, there's a confirmed performance degradation past 500GB.
Based on this wide comparison of NoSQL databases, the problems in TC start at more than 20 million rows.
Among the possible causes of the size dependency is the fact that TC is implemented using a hash table, so at some point you run into hash bucket collisions, which of course ruins performance. By default the key space is not as large as it could be; you need to tune the "bnum" parameter (the number of elements in the bucket array) to improve performance.
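The bnum tuning mentioned above is done through Tokyo Cabinet's C API (tchdb.h) by calling tchdbtune() before the database is opened. Here is a minimal sketch; the file name casket.tch and the bucket count of 40 million (sized for tens of millions of records) are illustrative assumptions, not values taken from the article.

```c
#include <tcutil.h>
#include <tchdb.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  TCHDB *hdb = tchdbnew();

  /* Tune before opening: a common rule of thumb is to set bnum to roughly
     0.5x-4x the expected number of records so buckets stay sparse.
     40 million is an assumed workload size; HDBTLARGE lets the file
     grow past 2GB. -1 keeps the defaults for apow and fpow. */
  if (!tchdbtune(hdb, 40000000LL, -1, -1, HDBTLARGE)) {
    fprintf(stderr, "tune error\n");
  }

  /* Open (and create if missing) the database file. */
  if (!tchdbopen(hdb, "casket.tch", HDBOWRITER | HDBOCREAT)) {
    fprintf(stderr, "open error: %s\n", tchdberrmsg(tchdbecode(hdb)));
    tchdbdel(hdb);
    return 1;
  }

  /* Store and read back a record to confirm the database works. */
  if (!tchdbput2(hdb, "foo", "hop")) {
    fprintf(stderr, "put error: %s\n", tchdberrmsg(tchdbecode(hdb)));
  }
  char *value = tchdbget2(hdb, "foo");
  if (value) {
    printf("foo -> %s\n", value);
    free(value);  /* tchdbget2 returns a malloc'd string */
  }

  tchdbclose(hdb);
  tchdbdel(hdb);
  return 0;
}
```

Note that tuning only takes effect when the database file is created (or via the tchmgr optimize command on an existing file), so it is worth estimating the record count up front.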
Based on various comparisons, MongoDB seems to be the recommended approach for large datasets.