Using GridFS - should it be on a separate database?
I am building a site with a lot of audio storage (terabytes), and I want to use GridFS so that I can shard the data and easily expand the database across multiple machines.
My question: would it be better to put the files in a separate MongoDB database? There will be a good number of documents in the main MongoDB database, and I am not sure what happens when you start sharding the GridFS portion.
Thanks!
Comments (1)
Even if you keep the GridFS storage in the same database as your other collections, you can still choose which collections to shard (or not) when you need to move to sharding. That said, if you have it in a separate database, you will be able to move it to a separate cluster more easily if you so choose. You could, for instance, have a 3-shard cluster for your "main" collections and a 5-shard cluster for GridFS (or any other configuration you choose).
As for sharding GridFS collections, please see the MongoDB docs on choosing a shard key for GridFS. Commonly, people shard the chunks collection (which is where the file data itself is stored) on files_id so that all chunks for the same file reside on the same shard. Again, please see the documentation page for more detail.
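To make the shard-key suggestion concrete, here is a minimal sketch in Python that builds the admin commands for sharding the chunks collection on files_id. The database name audio_files, the default bucket prefix fs, and the helper names gridfs_shard_commands and apply_commands are assumptions for illustration, not anything from the question. The command documents use the standard enableSharding and shardCollection admin commands; check the MongoDB docs for your server version before relying on them.

```python
from typing import Any, Dict, List


def gridfs_shard_commands(db_name: str, bucket: str = "fs") -> List[Dict[str, Any]]:
    """Build admin commands that shard a GridFS chunks collection on files_id.

    db_name and bucket are hypothetical; GridFS stores file data in
    "<bucket>.chunks" and file metadata in "<bucket>.files".
    """
    chunks_ns = f"{db_name}.{bucket}.chunks"
    return [
        # Enable sharding for the database (newer server versions may not need this).
        {"enableSharding": db_name},
        # Shard only the chunks collection; files_id keeps one file's chunks together.
        {"shardCollection": chunks_ns, "key": {"files_id": 1}},
    ]


def apply_commands(client: Any, commands: List[Dict[str, Any]]) -> None:
    """Run each command against the admin database of a client connected to a mongos.

    client is assumed to be a pymongo MongoClient (or anything exposing
    client.admin.command(dict)).
    """
    for cmd in commands:
        client.admin.command(cmd)


# Usage (assumes pymongo is installed and a mongos is reachable at this hypothetical host):
#   from pymongo import MongoClient
#   apply_commands(MongoClient("mongodb://mongos-host:27017"),
#                  gridfs_shard_commands("audio_files"))
```

Keeping the command construction separate from the connection makes it easy to point the same commands at a different cluster later, which is the flexibility a separate GridFS database is meant to give you.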