Downsides of storing binary data in Riak?
What are the problems, if any, with storing binary data in Riak?
Does it affect the maintainability and performance of the cluster?
How would performance differ between using Riak for this and using a distributed file system?
Comments (5)
Adding to @Oscar-Godson's excellent answer: you're likely to run into problems with values much larger than 50 MB. Bitcask is best suited to values of up to a few KB. If you're storing large values, you may want to consider an alternative storage backend, such as innostore.
I don't have experience storing binary values specifically, but we have a medium-sized cluster in production (5 nodes, on the order of 100M values, tens of TB), and we see frequent errors when inserting and retrieving values that are hundreds of KB in size. Performance in this case is inconsistent: sometimes it works, sometimes it doesn't. So if you're going to test, test at scale.
We also see problems with large values when running map-reduce queries: they simply time out. That may be less relevant to binary values, though (as @Matt-Ranney mentioned).
Also see @Stephen-C's answer here
The only problem I can think of is storing binary data larger than 50 MB, which they advise against. The whole point of Riak is just that:
Source: Schema Design in Riak - Introduction
With Riak, the recommended maximum object size is 2 MB. Above that, the recommendation is either to use Riak CS, which has been tested with objects up to 5 TB (stored in Riak as 1 MB objects), or to break the large object into 2 MB chunks yourself and link them by a key plus a suffix.
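The key-plus-suffix chunking mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `store` is a plain Python dict standing in for a bucket, and the names `put_chunked`, `get_chunked`, `chunk_key`, the `base:000000` suffix format, and the JSON manifest are all hypothetical choices, not part of any Riak client API.

```python
import hashlib
import json

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB, the recommended per-object maximum cited above


def chunk_key(base_key, index):
    """Key for one chunk: the base key plus a zero-padded numeric suffix (hypothetical format)."""
    return f"{base_key}:{index:06d}"


def put_chunked(store, base_key, data, chunk_size=CHUNK_SIZE):
    """Split `data` into chunks, store each under its own key, then store a manifest.

    `store` is any dict-like object; here it stands in for a key/value bucket.
    """
    count = 0
    for offset in range(0, len(data), chunk_size):
        store[chunk_key(base_key, count)] = data[offset:offset + chunk_size]
        count += 1
    manifest = {
        "chunks": count,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
    # The manifest is written last, after every chunk put has been issued.
    store[base_key] = json.dumps(manifest).encode("utf-8")
    return manifest


def get_chunked(store, base_key):
    """Read the manifest, fetch the chunks in order, and verify the reassembled bytes."""
    manifest = json.loads(store[base_key].decode("utf-8"))
    data = b"".join(
        store[chunk_key(base_key, i)] for i in range(manifest["chunks"])
    )
    if len(data) != manifest["size"]:
        raise ValueError("reassembled size does not match manifest")
    if hashlib.sha256(data).hexdigest() != manifest["sha256"]:
        raise ValueError("reassembled data does not match manifest checksum")
    return data
```

Against a real cluster, the dict reads and writes would be replaced by whatever get/put calls your client library provides. The checksum in the manifest is there because the chunks are independent objects, so a reader may want a way to detect a missing or stale chunk.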
I personally haven't noticed any issues storing data such as images and documents (both DOC and PDF) in Riak. I don't have performance numbers, but I might be able to gather some if I remember.
One thing to note: with Riak you can use Luwak, which provides an API for storing large files. This has been pretty useful.
One problem may be that it is difficult, if not impossible, to use JavaScript map/reduce across your binary data. You'll probably need Erlang for that.