Best data store for huge data with a large number of reads and writes
I need to store around 100 million records in a database. Around 60-70% of them will be deleted daily, and the same number of records will be inserted daily. I feel a document database like HBase or Bigtable would fit this. There are many other data stores, such as Cassandra, MongoDB, etc. Which data store would be useful for this kind of problem, given that there will be a huge number of reads/writes (on the order of tens of millions) daily?
Based on the characteristics you've mentioned (JSON documents, access by key, 100 million records, balanced reads and writes), I'd say CouchDB or Membase are good candidates (here's a quick comparison: http://vschart.com/compare/membase/vs/couchdb).
Both HBase and Cassandra can probably also work, but for HBase you'd need to install a lot of components (Hadoop, ZooKeeper, etc.) that you won't really use, and Cassandra is better when you have more writes than reads (at least as of the last time I used it).
Bigtable is, unfortunately, internal to Google :)
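To put the numbers from the question in perspective, here is a rough back-of-the-envelope sketch of the average write rate implied by the daily churn. It assumes the deletes and inserts are spread evenly over 24 hours and that each delete or insert counts as one write operation; both are simplifying assumptions, and the function name is made up for illustration. Real peak load is likely to be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_churn_per_second(total_records, churn_fraction):
    """Average deletes plus inserts per second, assuming an even spread over a day."""
    deletes = total_records * churn_fraction
    inserts = deletes  # the question states the same number is inserted daily
    return (deletes + inserts) / SECONDS_PER_DAY


# 100 million records, 60-70% replaced every day (figures from the question)
low = average_churn_per_second(100_000_000, 0.60)
high = average_churn_per_second(100_000_000, 0.70)
print(f"{low:.0f} to {high:.0f} write operations per second on average")
```

Under these assumptions the churn alone works out to roughly 1,400 to 1,600 write operations per second on average, before counting any reads.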