NoSQL databases and many medium-sized blobs
Is there a NoSQL (or other type of) database suitable for storing a large number (i.e. >1 billion) of "medium-sized" blobs (i.e. 20 KB to 2 MB)? All I need is a mapping from A (an identifier) to B (a blob), the ability to retrieve B given A, a consistent external API for access, and the ability to "just add another computer" to scale the system.
Something simpler than a database, e.g. a distributed key-value system, may be just fine, and I'd appreciate any thoughts in that vein as well.
Thank you for reading.
Brian
Comments (3)
If your API requirements are purely along the lines of "Get(key), Put(key,blob), Remove(key)" then a key-value store (or more accurately a "Persistent distributed hash table") is exactly what you are looking for.
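To make that three-call contract concrete, here is a minimal Python sketch. The `BlobStore` class and its method names are hypothetical, and the backing dict is only an in-memory stand-in; a real deployment would route the same three calls to whichever store you pick.

```python
from typing import Dict, Optional


class BlobStore:
    """Hypothetical in-memory stand-in for a key-value blob store."""

    def __init__(self) -> None:
        # Assumption: a plain dict replaces the real persistent backend.
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, blob: bytes) -> None:
        """Store (or overwrite) the blob under the given key."""
        self._data[key] = bytes(blob)

    def get(self, key: str) -> Optional[bytes]:
        """Return the blob for key, or None if the key is unknown."""
        return self._data.get(key)

    def remove(self, key: str) -> bool:
        """Delete the key; return True if something was actually removed."""
        return self._data.pop(key, None) is not None


store = BlobStore()
store.put("image-0001", b"example blob bytes")
blob = store.get("image-0001")
```

If the application only ever talks to an interface this narrow, swapping one store for another later should mostly be a matter of writing a new adapter.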
There are quite a few of these available, but without additional information it is hard to make a solid recommendation. What OS are you targeting? Which language(s) are you developing with? What are the I/O characteristics of your app (cold/immutable data such as images, or high write loads such as tweets)?
Some of the KV systems worth looking into:
- MemcacheDB
- Berkeley DB
- Voldemort
You may also want to look into document stores such as CouchDB or RavenDB. Document stores are similar to KV stores, but they understand the persistence format (usually JSON), so they can provide additional services such as indexing.
What about Jackrabbit?
I got to know Jackrabbit when I worked with Liferay CMS. Liferay uses Jackrabbit to implement its Document Library. It stores user files in the server's file system.
You'll also want to take a look at Riak. Riak is very focused on doing exactly what you're asking for (just add a node, easy to access).
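A common technique behind "just add a node" scaling in distributed key-value stores is consistent hashing. The sketch below is only an illustration of the general idea, not Riak's actual implementation; the `HashRing` class, the node names, and the choice of 64 virtual points per node are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Illustrative consistent-hash ring mapping keys to node names."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 64) -> None:
        self._vnodes = vnodes  # virtual points per node (assumed value)
        self._ring: List[Tuple[int, str]] = []  # sorted (point, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        # First 8 bytes of SHA-1 as an integer position on the ring.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node: str) -> None:
        """Place the node's virtual points on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        """Return the node owning the first point at or after the key's hash."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"blob-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # "just add another computer"
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

With this scheme, adding a node only reassigns the keys that the new node takes over; everything else stays where it was, which is what makes incremental scaling cheap compared with rehashing every key.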