Drawbacks of storing binary data in Riak?

Posted 2024-11-09 13:53:15


What are the problems, if any, of storing binary data in Riak?

Does it affect the maintainability and performance of the clustering?

What would the performance differences be between using Riak for this and using a distributed file system?


Comments (5)

眼泪也成诗 2024-11-16 13:53:15


Adding to @Oscar-Godson's excellent answer, you're likely to experience problems with values much larger than 50MBs. Bitcask is best suited for values that are up to a few KBs. If you're storing large values, you may want to consider alternative storage backends, such as innostore.

I don't have experience with storing binary values, but we have a medium-sized cluster in production (5 nodes, on the order of 100M values, tens of TBs), and we're seeing frequent errors related to inserting and retrieving values that are hundreds of KBs in size. Performance in this case is inconsistent - sometimes it works, sometimes it doesn't - so if you're going to test, test at scale.
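To make "test at scale" concrete, here is a minimal probe sketch. It assumes a local node exposing the legacy /riak/<bucket>/<key> HTTP interface; the host, port, bucket name, and sizes are all placeholders to adjust for your own cluster.

```python
import os
import time
import requests

# Assumed local node and legacy /riak/<bucket>/<key> HTTP path;
# adjust host, port, and bucket for your own cluster.
BASE = "http://127.0.0.1:8098/riak/binary_probe"

def put(key, value):
    """Store raw bytes under a key via Riak's HTTP interface."""
    r = requests.put(f"{BASE}/{key}", data=value,
                     headers={"Content-Type": "application/octet-stream"})
    r.raise_for_status()

def get(key):
    """Fetch the raw bytes stored under a key."""
    r = requests.get(f"{BASE}/{key}")
    r.raise_for_status()
    return r.content

# Time round-trips at several value sizes; run this long enough to catch
# the intermittent failures described above, not just one lucky request.
for size in (1_000, 100_000, 1_000_000, 10_000_000):
    payload = os.urandom(size)
    start = time.time()
    put(f"probe_{size}", payload)
    assert get(f"probe_{size}") == payload
    print(f"{size:>10} bytes round-trip: {time.time() - start:.3f}s")
```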

We're also seeing problems with large values when running map-reduce queries - they simply time out. However, that may be less relevant to binary values... (as @Matt-Ranney mentioned).

Also see @Stephen-C's answer here

爱人如己 2024-11-16 13:53:15


The only problem I can think of is storing binary data larger than 50MB, which they advise against. The whole point of Riak is just that:

Another reason one might pick Riak is for flexibility in modeling your data. Riak will store any data you tell it to in a content-agnostic way — it does not enforce tables, columns, or referential integrity. This means you can store binary files right alongside more programmer-transparent formats like JSON or XML.

Source: Schema Design in Riak - Introduction
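As a small illustration of that content-agnostic behaviour, here is a hedged sketch of a JSON document and a binary file living in the same bucket, distinguished only by their Content-Type. The bucket, keys, and file name are made up, and the /riak/<bucket>/<key> path is the old HTTP API.

```python
import json
import requests

BASE = "http://127.0.0.1:8098/riak/documents"  # hypothetical bucket

# A JSON document...
requests.put(f"{BASE}/invoice-42",
             data=json.dumps({"total": 99.5, "currency": "EUR"}),
             headers={"Content-Type": "application/json"}).raise_for_status()

# ...and a binary file right next to it. Riak only records the
# Content-Type; it never inspects or constrains the value itself.
with open("invoice-42.pdf", "rb") as f:
    requests.put(f"{BASE}/invoice-42.pdf", data=f.read(),
                 headers={"Content-Type": "application/pdf"}).raise_for_status()
```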

夜清冷一曲。 2024-11-16 13:53:15


With Riak, the recommended maximum is 2MB per object. Above that, it's recommended either to use Riak CS, which has been tested with objects up to 5TB (stored in Riak as 1MB objects), or to break your large object into 2MB chunks yourself and link the chunks by a key and suffix.
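A minimal sketch of that manual chunking approach, again assuming the legacy /riak/<bucket>/<key> HTTP interface. The bucket name, the ":<index>" key suffix scheme, and the plain-text manifest are just one way to do it.

```python
import requests

CHUNK = 2 * 1024 * 1024                      # the 2MB ceiling mentioned above
BASE = "http://127.0.0.1:8098/riak/blobs"    # assumed bucket and endpoint

def put_chunked(base_key, data):
    """Store data as <base_key>:0, <base_key>:1, ... plus a small manifest."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    for i, chunk in enumerate(chunks):
        requests.put(f"{BASE}/{base_key}:{i}", data=chunk,
                     headers={"Content-Type": "application/octet-stream"}
                     ).raise_for_status()
    # The manifest records how many chunks to fetch on read.
    requests.put(f"{BASE}/{base_key}", data=str(len(chunks)),
                 headers={"Content-Type": "text/plain"}).raise_for_status()

def get_chunked(base_key):
    """Read the manifest, then reassemble the chunks in order."""
    n = int(requests.get(f"{BASE}/{base_key}").text)
    return b"".join(requests.get(f"{BASE}/{base_key}:{i}").content
                    for i in range(n))
```

In practice you would also want the manifest to carry a checksum, so a partially written or corrupted object can be detected on read.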

无语# 2024-11-16 13:53:15


I personally haven't noticed any issues storing data such as images and documents (both DOC and PDF) into Riak. I don't have performance numbers but might be able to gather some should I remember.

Something of note: with Riak you can use Luwak, which provides an API for storing large files. This has been pretty useful.
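For anyone who hasn't used it: Luwak was exposed through Riak's HTTP interface in the pre-1.1 releases. The sketch below assumes the /luwak/<key> path and default port from the docs of that era, and the file name is made up, so check the documentation for the version you actually run.

```python
import requests

LUWAK = "http://127.0.0.1:8098/luwak"   # assumed legacy Luwak endpoint

# Stream a large file up; Luwak splits it into blocks behind the scenes.
with open("big-report.pdf", "rb") as f:
    requests.put(f"{LUWAK}/big-report.pdf", data=f,
                 headers={"Content-Type": "application/pdf"}).raise_for_status()

# Fetch it back as one logical object.
pdf_bytes = requests.get(f"{LUWAK}/big-report.pdf").content
```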

傻比既视感 2024-11-16 13:53:15


One problem may be that it is difficult, if not impossible, to use JavaScript map/reduce across your binary data. You'll probably need Erlang for that.
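A hedged sketch of what invoking an Erlang map phase over HTTP looked like in that era. The /mapred endpoint, the riak_kv_mapreduce built-in, and the bucket/key names are from memory of the old docs and should be checked against your installed version.

```python
import json
import requests

# Use a built-in Erlang map function instead of a JavaScript one. For truly
# binary values you would typically write your own Erlang function, since
# JSON-encoding raw bytes in the result can still be a problem.
job = {
    "inputs": [["blobs", "report-chunks:0"]],   # hypothetical bucket/key
    "query": [
        {"map": {"language": "erlang",
                 "module": "riak_kv_mapreduce",
                 "function": "map_object_value"}}
    ],
}
resp = requests.post("http://127.0.0.1:8098/mapred",
                     data=json.dumps(job),
                     headers={"Content-Type": "application/json"})
print(resp.status_code, resp.text)
```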
