Distributed B-trees across a network
I would like to build a B+tree that spans a multi-node
computer network (an internal subnet of Linux PCs) for
elastic, massive storage. Range scans are important.
Is this basically the underlying data structure of
distributed DB systems such as Cassandra and HBase?
Is there any research on distributed B+trees?
I saw the paper at
http://www.cs.yale.edu/homes/aspnes/papers/opodis2005-b-trees-final.pdf
but skip B-trees simply remove faulty nodes, so data is lost.
I'm particularly interested in B+trees with built-in redundancy:
if a host fails and all the tree nodes it hosts go offline,
I'd like a replica host to become the primary node
server and take the place of the failed host.
I don't want to use a collection of DB instances
(one DB per node), as sharding is not a good choice
for a massively scaled storage system running on commodity
x86/x64 hardware with a FOSS OS.
Am I reinventing the wheel?
Should I just use Cassandra or HBase?
Comments (1)
Cassandra supports range queries.
Google's Bigtable automatically adds new machines to the cluster when you turn them on, so it is very elastic and easy to grow. Unfortunately, its speed comes with a drawback: the queries are very restrictive, although you can do some range queries. See this article for a list and more details: http://geothought.blogspot.com/2009/04/google-app-engine-and-bigtable-very.html
A great example of how data is stored in Bigtable: http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable
A nice Stack Overflow post: "Storing massive ordered time series data in Bigtable derivatives"
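
To make the connection to the question concrete: Bigtable-style stores keep rows sorted by key and split that sorted key space into contiguous ranges, which is what makes range scans cheap. Below is a minimal in-memory sketch of that idea in Python, with a replica list per range so that another host can become primary when one fails. It is not the API of Cassandra, HBase, or Bigtable; the names `Tablet`, `RangePartitionedStore`, `fail_host`, and `primary_for`, and the host names, are hypothetical, and the data is held in a single process rather than actually copied between machines.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Tablet:
    """One contiguous, sorted slice of the key space (hypothetical name)."""
    start_key: str                 # inclusive lower bound of this slice
    replicas: list                 # host names; replicas[0] acts as primary
    keys: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def put(self, key, value):
        # Keep keys sorted so range scans are a simple forward walk.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)

    def scan(self, lo, hi):
        """Yield (key, value) pairs with lo <= key < hi, in key order."""
        i = bisect.bisect_left(self.keys, lo)
        while i < len(self.keys) and self.keys[i] < hi:
            yield self.keys[i], self.values[i]
            i += 1


class RangePartitionedStore:
    """Routes keys to tablets by range; assumes at least one tablet."""

    def __init__(self, tablets):
        self.tablets = sorted(tablets, key=lambda t: t.start_key)
        self.starts = [t.start_key for t in self.tablets]

    def _index_for(self, key):
        # Rightmost tablet whose start_key <= key (first tablet as fallback).
        return max(bisect.bisect_right(self.starts, key) - 1, 0)

    def put(self, key, value):
        self.tablets[self._index_for(key)].put(key, value)

    def scan(self, lo, hi):
        # Visit only the tablets whose range can overlap [lo, hi).
        i = self._index_for(lo)
        while i < len(self.tablets) and self.tablets[i].start_key < hi:
            yield from self.tablets[i].scan(lo, hi)
            i += 1

    def fail_host(self, host):
        # Drop the failed host; the next replica in each list becomes primary.
        for t in self.tablets:
            if host in t.replicas:
                t.replicas.remove(host)

    def primary_for(self, key):
        return self.tablets[self._index_for(key)].replicas[0]


store = RangePartitionedStore([
    Tablet("", ["host-a", "host-b"]),
    Tablet("m", ["host-b", "host-c"]),
])
for k in ["apple", "kiwi", "mango", "pear"]:
    store.put(k, k.upper())

print(list(store.scan("k", "n")))   # scan crosses the tablet boundary at "m"
store.fail_host("host-b")
print(store.primary_for("mango"))   # a surviving replica now serves this range
```

A real system would also need to replicate writes between hosts, detect failures, and split ranges as they grow, which is most of what Cassandra and HBase already provide.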