Combining relational and document-based "databases"
I am developing a system that is all about media archiving, searching, uploading, distributing and thus about handling BLOBs.
I am currently trying to find the best way to handle the BLOBs. I have limited resources for high-end servers with a lot of memory and huge disks, but I can access a large array of medium-performance off-the-shelf computers and hook them up to the Internet.
Therefore I decided not to store the BLOBs in a central relational database, because in the worst case I would then have one very heavy database instance, possibly on a single average machine. Not an option.
Storing the BLOBs as files directly on the filesystem and storing their paths in the database is also somewhat ugly, and distribution would have to be managed manually, with me keeping track of the different copies myself. I don't even want to get close to that.
I looked at CouchDB and I really like its peer-to-peer based design. This would allow me to run a distributed cluster of machines across the Internet, which implies:
- Low cost Hardware
- Distribution for Redundancy and Failover out of the box
- Lightweight REST Interface
So if I got it right, one could summarize it like this: a cloud-like API on top of a self-managed, distributed, replicated system.
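To make the "lightweight REST interface" point concrete, here is a minimal sketch of how a BLOB upload could look as a CouchDB attachment request. CouchDB stores attachments with a PUT to /{db}/{docid}/{attname}, passing ?rev= when the document already exists. The host, database name, document ID and the helper name build_attachment_put are assumptions made up for illustration; the sketch only builds the request and does not send it.

```python
import urllib.request
from urllib.parse import quote


def build_attachment_put(base_url, db, doc_id, filename, data, content_type, rev=None):
    """Build (but do not send) a PUT request that stores `data` as a
    CouchDB attachment at /{db}/{doc_id}/{filename}.

    All names passed in are hypothetical; `rev` is only needed when the
    document already exists.
    """
    url = "{}/{}/{}/{}".format(
        base_url.rstrip("/"),
        quote(db, safe=""),
        quote(doc_id, safe=""),
        quote(filename, safe=""),
    )
    if rev is not None:
        url += "?rev=" + quote(rev, safe="")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Hypothetical node, database and document names.
req = build_attachment_put(
    "http://localhost:5984/", "media", "clip-42", "clip.mp4",
    b"\x00\x01", "video/mp4",
)
```

Sending it would be a matter of urllib.request.urlopen(req) against a running CouchDB node, with authentication added as that node requires.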
The rest of the system does the normal stuff any average web application does: handling sessions, security, users, searching and the like. For this part I still want to use a relational data model. (CouchDB claims not to be a replacement for relational databases.)
So I would have all the standard data, including the BLOB metadata, in the relational database, but the BLOBs themselves in CouchDB.
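The split described above could be sketched roughly as follows. This is only an illustration under stated assumptions: sqlite3 stands in for the relational database, InMemoryBlobStore stands in for the CouchDB side (a real implementation would talk HTTP to a CouchDB node), and the table layout, the content-hash blob ID and all function names are invented for the example.

```python
import hashlib
import sqlite3
from typing import Protocol


class BlobStore(Protocol):
    """Anything that can store and return bytes by ID (hypothetical interface)."""
    def put(self, blob_id: str, data: bytes) -> None: ...
    def get(self, blob_id: str) -> bytes: ...


class InMemoryBlobStore:
    """Stand-in for the CouchDB side, so the sketch runs without a server."""
    def __init__(self) -> None:
        self._blobs = {}

    def put(self, blob_id: str, data: bytes) -> None:
        self._blobs[blob_id] = data

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


def init_metadata(conn: sqlite3.Connection) -> None:
    # Assumed schema: one row of metadata per BLOB, keyed by the blob ID.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS media ("
        " blob_id TEXT PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " content_type TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL)"
    )


def archive(conn, store: BlobStore, title: str, content_type: str, data: bytes) -> str:
    """Store the BLOB first, then record its metadata relationally."""
    blob_id = hashlib.sha256(data).hexdigest()  # content hash as ID (assumption)
    store.put(blob_id, data)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO media VALUES (?, ?, ?, ?)",
            (blob_id, title, content_type, len(data)),
        )
    return blob_id


def fetch(conn, store: BlobStore, blob_id: str):
    """Look up metadata in the relational side, then pull the bytes."""
    row = conn.execute(
        "SELECT title, content_type FROM media WHERE blob_id = ?", (blob_id,)
    ).fetchone()
    if row is None:
        raise KeyError(blob_id)
    return row[0], row[1], store.get(blob_id)
```

There is no transaction spanning both stores, so the order of the two writes matters. Writing the BLOB first means a failure in between leaves an orphaned BLOB that can be cleaned up later, rather than a metadata row pointing at nothing.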
Do you see a problem with this approach? Am I missing something important? Can you think of better solutions?
Thank you!
Comments (3)
You could try Amazon's SimpleDB and S3 together with SimpleJPA. SimpleJPA is a JPA implementation on top of SimpleDB. It uses SimpleDB for the structured entity data and S3 to store BLOBs.
Take a look at MongoDB. It supports storing binary data in an efficient format and is incredibly fast.
No problem. I have done a design very similar to that one. You may also want to take a look at HBase as an alternative to CouchDB, and at the Adaptive Object-Model architectural pattern as a way to manage your data and metadata.