How to design collections in MongoDB for multi-tenant analytics?
I have seen several posts about multi-tenant approaches with Mongo, but I hope to get more specific feedback on my more specific requirements.
Here is what I know from the business side:
- "Free evaluation": should allow free and easy registration of a new tenant (customer); many of them will stay in the system with low activity and volume, and for long.
- Analytics [per tenant] is a key component of the solution. There is no acute need for real-time analytics, and most of it may be done in "batch" processing per tenant.
- Some cross-tenant "internal" analytic processing may also be needed, but at later stages.
I assume that the need for "lightweight" allocation of a new tenant is not consistent with the "database per tenant" approach.
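To show what I mean by "lightweight" allocation, here is a rough sketch in Python/pymongo. With a shared database, signing up a tenant is a single insert rather than provisioning a whole new database; the `tenants` collection name and the `plan` field are just placeholders I made up:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["saas"]

def register_tenant(name: str) -> str:
    # Signing up a new tenant is just one insert: no new database or collection.
    result = db.tenants.insert_one({"name": name, "plan": "free_evaluation"})
    return str(result.inserted_id)
```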
Assuming I keep separate collections for separate tenants, in the same DB:
- Would it be efficient to schedule that many map-reduce aggregations, one per tenant's collection, as opposed to "one big" scan over multi-tenant collections? (A sketch of what I mean follows this list.)
- Is there any practical way to perform analytic calculations across multiple collections?
- Are there any other issues with a large number of collections, beyond the limit on the total number of collections and indexes? For example, losing support from libraries and tools that are designed to work with a fixed set of collections?
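To make the first question concrete, here is roughly what I imagine the scheduled per-tenant batch job would look like (a sketch only, in Python/pymongo; the `events_<tenantId>` collection naming and the daily-totals map/reduce pair are assumptions of mine, not a decided schema):

```python
from pymongo import MongoClient
from bson.code import Code

client = MongoClient("mongodb://localhost:27017")
db = client["saas"]

# Illustrative per-tenant daily totals; the field names (day, amount) are made up.
map_fn = Code("function () { emit(this.day, this.amount); }")
reduce_fn = Code("function (key, values) { return Array.sum(values); }")

def nightly_batch():
    # One small map-reduce job per tenant collection instead of one big scan.
    for name in db.list_collection_names():
        if not name.startswith("events_"):
            continue
        tenant_id = name[len("events_"):]
        db.command({
            "mapReduce": name,
            "map": map_fn,
            "reduce": reduce_fn,
            "out": {"replace": "daily_totals_" + tenant_id},
        })
```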
Alternatively, when managing the "fixed set" multi-tenant collections:
- What are the "best practices" for the collections structure? Should I keep the "tenant document" "by reference" in each and every document within a multi-tenant collection?
- What are the "best practices" to run map-reduce in such a case? Should I try one huge map-reduce, or running multiple map-reduce tasks filtering the collections per tenant?
Thanks,
Max