Riak single-bucket backup solutions
What are your recommendations for solutions that allow backing up a single Riak bucket to a file, either by streaming or by snapshot?
Comments (3)
Backing up just a single bucket is going to be a difficult operation in Riak.
All of the solutions will boil down to the following two steps:
List all of the objects in the bucket. This is the tricky part, since there is no "manifest" or a list of contents of any bucket, anywhere in the Riak cluster.
Issue a GET to each one of those objects from the list above, and write it to a backup file. This part is generally easy, though for maximum performance you want to make sure you're issuing those GETs in parallel, in a multithreaded fashion, and using some sort of connection pooling.
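The second step can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: `backup_keys` and its `fetch` parameter are hypothetical names, and `fetch` stands in for whatever pooled Riak GET you use (for example, a function built on an HTTP client that reuses connections). The output format, one JSON object per line, is also just one possible choice.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def backup_keys(keys, fetch, out, workers=8):
    """Fetch every key in parallel and write one JSON line per object.

    `fetch` is a hypothetical callable (key -> str) standing in for a
    pooled Riak GET; `out` is any writable text file object.
    Returns the number of objects written.
    """
    count = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The worker threads run the fetches concurrently; pool.map yields
        # the results in submission order, so only this thread writes to `out`.
        results = pool.map(lambda k: (k, fetch(k)), keys)
        for key, value in results:
            out.write(json.dumps({"key": key, "value": value}) + "\n")
            count += 1
    return count
```

Note that `pool.map` submits a task for every key up front, so for a very large bucket you would probably want to feed it keys in batches rather than all at once.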
As far as listing all of the objects goes, you have three choices.
One is to do a Streaming List Keys operation on the bucket via HTTP (e.g. /buckets/bucket/keys?keys=stream) or Protocol Buffers. See http://docs.basho.com/riak/latest/dev/references/http/list-keys/ and http://docs.basho.com/riak/latest/dev/references/protocol-buffers/list-keys/ for details. Under no circumstances should you do a non-streaming regular List Keys operation. (It will hang your whole cluster, and will eventually either time out or crash once the number of keys grows large enough.)
Two is to issue a Secondary Index (2i) query to get that object list. See http://docs.basho.com/riak/latest/dev/using/2i/ for discussion and caveats.
Three applies if you're using Riak Search and can retrieve all of the objects via a single paginated search query. (However, Riak Search has a query result limit of 10,000 results, so this approach is far from ideal.)
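For the first option, the client has to cope with a response that arrives in pieces. The sketch below assumes the HTTP body is a run of JSON objects, each carrying a "keys" array that may be empty, and that the pieces you read can be split at arbitrary points. The function name `iter_streamed_keys` is hypothetical, and the chunks are assumed to be already-decoded text.

```python
import json


def iter_streamed_keys(chunks):
    """Yield keys from a streamed list-keys response body.

    Assumes the body looks like {"keys": ["a", "b"]}{"keys": ["c"]}...
    and that `chunks` is an iterable of text fragments of that body.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        # raw_decode does not skip leading whitespace, so strip it here.
        buffer = (buffer + chunk).lstrip()
        while buffer:
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # incomplete object: wait for the next chunk
            # Objects without a "keys" entry contribute nothing.
            yield from obj.get("keys", [])
            buffer = buffer[end:].lstrip()
```

Because this is a generator, keys can be handed to the GET workers as they arrive instead of being collected into one large list first.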
For an example of a standalone app that can back up a single bucket, take a look at Riak Data Migrator, an experimental Java app that uses the Streaming List Keys approach combined with efficient parallel GETs.
The Basho function contrib has an Erlang solution for backing up a single bucket. It is a custom function, but it should do the trick.
http://contrib.basho.com/bucket_exporter.html
As far as I know, there's no automated solution for backing up a single bucket in Riak. You'd have to use the riak-admin command line tool, which backs up a whole physical node rather than one bucket. You could write something that retrieves all keys in a single bucket, using low r values (r = 1) if you want it to be fast at the expense of read consistency.
Buckets are a logical namespace; all of the keys are stored in the same bitcask structure. That's why the only way to get just a single bucket is to write a tool to stream the keys yourself.
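As a small illustration of the low r value idea, the sketch below builds a per-object GET URL with r passed as a query parameter. The helper name `object_url` and the base address are assumptions made for the example, and the path layout simply follows the /buckets/bucket/keys form used elsewhere on this page.

```python
from urllib.parse import quote, urlencode


def object_url(base, bucket, key, r=1):
    """Build a GET URL for one object, asking for a read quorum of `r`.

    `base` is a hypothetical node address such as http://localhost:8098.
    Bucket and key are percent-encoded so that slashes and similar
    characters in them do not change the path.
    """
    path = f"/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    return f"{base.rstrip('/')}{path}?{urlencode({'r': r})}"
```

With r = 1 a read only waits for a single replica to answer, which is why it is faster but may return a stale value.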