What is the recommended approach for multi-tenant databases in MongoDB?

Posted on 2024-08-30 14:20:06

I'm thinking of creating a multi-tenant app using MongoDB. I don't have any guesses in terms of how many tenants I'd have yet, but I would like to be able to scale into the thousands.

I can think of three strategies:

  1. All tenants in the same collection, using tenant-specific fields for security
  2. 1 Collection per tenant in a single shared DB
  3. 1 Database per tenant

The voice in my head is suggesting that I go with option 2.

Thoughts and implications, anyone?
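To make the three strategies concrete, here is a minimal sketch of where one tenant's documents would live under each of them. The database name `app`, the `tenant_id` field, the `orders` entity and the naming patterns are all illustrative assumptions, not something MongoDB prescribes.

```python
from typing import NamedTuple


class Target(NamedTuple):
    database: str
    collection: str
    base_filter: dict


def resolve(strategy: int, tenant_id: str, entity: str = "orders") -> Target:
    """Map a tenant and a logical entity to where its documents would live."""
    if strategy == 1:
        # One shared collection; every query must carry the tenant field.
        return Target("app", entity, {"tenant_id": tenant_id})
    if strategy == 2:
        # One collection per tenant inside a single shared database.
        return Target("app", f"{tenant_id}_{entity}", {})
    if strategy == 3:
        # One database per tenant.
        return Target(f"tenant_{tenant_id}", entity, {})
    raise ValueError(f"unknown strategy: {strategy}")


for s in (1, 2, 3):
    print(s, resolve(s, "acme"))
```

Only strategy 1 needs a filter on every query; strategies 2 and 3 move the separation into the namespace instead.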

Comments (5)

舞袖。长 2024-09-06 14:20:06

I have the same problem to solve and am also considering the variants.
As I have years of experience creating SaaS multi-tenant applications, I was also going to select the second option, based on my previous experience with relational databases.

While doing my research I found this article on the MongoHQ support site (Wayback Machine link added since the original is gone):
https://web.archive.org/web/20140812091703/http://support.mongohq.com/use-cases/multi-tenant.html

The authors say to avoid the 2nd option at any cost, which, as I understand it, is not specific to mongodb. My impression is that this applies to most of the NoSQL DBs I researched (CouchDB, Cassandra, Couchbase Server, etc.) due to the specifics of their database design.

Collections (or buckets, or whatever they are called in different DBs) are not the same thing as security schemas in an RDBMS. Although they behave as containers for documents, they are useless for applying good tenant separation. I couldn't find a NoSQL database that can apply security restrictions based on collections.

Of course, you can use mongodb role-based security to restrict access at the database/server level. (http://docs.mongodb.org/manual/core/authorization/)
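As a rough illustration of database-level restriction, the sketch below builds the document for MongoDB's `createUser` command with the built-in `readWrite` role scoped to a single tenant database. The `tenant_<id>` database name and the `<id>_app` user name are made-up conventions, and the dict is only constructed here; it would have to be sent through a driver's command runner to take effect.

```python
def tenant_user_command(tenant_id: str, password: str) -> dict:
    """Build a createUser command granting readWrite on one tenant database."""
    db_name = f"tenant_{tenant_id}"  # hypothetical naming convention
    return {
        "createUser": f"{tenant_id}_app",  # hypothetical user name
        "pwd": password,
        # The built-in readWrite role is limited to the named database.
        "roles": [{"role": "readWrite", "db": db_name}],
    }


cmd = tenant_user_command("acme", "change-me")
print(cmd["roles"])
```

A user created this way can read and write `tenant_acme` only, which is the kind of separation that fits a database-per-tenant layout.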

I would recommend the 1st option if:

  • You have enough time and resources to deal with the complexity of the
    design, implementation and testing of this scenario.
  • You are not going to have many differences in database structure and
    functionality between tenants.
  • Your application design will allow tenants to make only minimal
    customizations at runtime.
  • You want to optimize space and minimize the usage of hardware
    resources.
  • You are going to have thousands of tenants.
  • You want to scale out fast and at a good cost.
  • You are NOT going to back up data per tenant (keeping separate
    backups for each tenant). It is possible to do that even in this
    scenario, but the effort will be huge.
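Much of the design and testing complexity of the 1st option comes from making sure every query is tenant-scoped. A minimal sketch of one way to centralize that, assuming a hypothetical `tenant_id` field on every document:

```python
from typing import Optional


def scoped(tenant_id: str, query: Optional[dict] = None) -> dict:
    """Return a copy of `query` that is forced to match a single tenant."""
    result = dict(query or {})
    # Refuse queries that explicitly ask for a different tenant's data.
    if result.get("tenant_id", tenant_id) != tenant_id:
        raise PermissionError("query targets another tenant")
    result["tenant_id"] = tenant_id
    return result


print(scoped("acme", {"status": "open"}))
```

Routing all reads and writes through a helper like this keeps the tenant check in one place. Indexes on the shared collection would then typically start with the tenant field, which is also an assumption of this sketch rather than advice from the thread.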

I would go for the 3rd option if:

  • You are going to have a small list of tenants (several hundred).
  • The specifics of the business require you to support big differences in database structure between tenants (e.g. integration with 3rd-party systems, import/export of data).
  • Your application design will allow customers (tenants) to make significant changes to the application at runtime (adding modules, customizing fields, etc.).
  • You have enough resources to scale out with new hardware nodes quickly.
  • You are required to keep versions/backups of data per tenant. Restoring will also be easy.
  • There are legal/regulatory restrictions that force you to keep different tenants in different databases (or even data centers).
  • You want to fully utilize the out-of-the-box security features of mongodb, such as roles.
  • There are big differences in size between tenants (you have many small tenants and a few very large ones).
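The per-tenant backup point can be sketched as follows: with one database per tenant, a backup is a single `mongodump` call restricted to that database. The `tenant_<id>` database name and the backup directory layout are assumptions for illustration; the function only builds the argument list and does not run anything.

```python
from typing import List


def dump_command(tenant_id: str, backup_root: str) -> List[str]:
    """Build a mongodump invocation that backs up one tenant's database."""
    return [
        "mongodump",
        "--db", f"tenant_{tenant_id}",          # hypothetical database name
        "--out", f"{backup_root}/{tenant_id}",  # hypothetical directory layout
    ]


print(" ".join(dump_command("acme", "/backups")))
```

The resulting list could be handed to `subprocess.run` on a host where the MongoDB tools are installed.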

If you post additional details about your application, perhaps I can give you more detailed advice.

看海 2024-09-06 14:20:06

I found a good answer in the comments in this link:

http://blog.boxedice.com/2010/02/28/notes-from-a-production-mongodb-deployment/

Basically option #2 seems to be the best way to go.

Quote from David Mytton's comment:

We decided not to have a database per customer because of the way MongoDB allocates its data files. Each database uses its own set of files:

"The first file for a database is dbname.0, then dbname.1, etc. dbname.0 will be 64MB, dbname.1 128MB, etc., up to 2GB. Once the files reach 2GB in size, each successive file is also 2GB.

"Thus if the last datafile present is, say, 1GB, that file might be 90% empty if it was recently reached." (from the manual)

As users sign up to the trial and give things a go, we'd get more and more databases that were at least 2GB in size, even if the whole of the data file wasn't used. We found this used a massive amount of disk space compared to having several databases for all customers where the disk space can be used to maximum efficiency.

Sharding will be on a per collection basis as standard, which presents a problem where the collection never reaches the minimum size to start sharding, as is the case for quite a few of ours (e.g. collections just storing user login details). However, we have requested that this will also be able to be done on a per database level. See
http://jira.mongodb.org/browse/SHARDING-41

There are no performance tradeoffs using lots of collections. See
http://www.mongodb.org/display/DOCS/Using+a+Large+Number+of+Collections
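The allocation scheme quoted from the manual can be worked through with a few lines of arithmetic. This is a simplified model that only follows the doubling rule in the quote (64MB first file, 2GB cap); it ignores any extra file the server may preallocate ahead of use, and the function names are made up for this sketch.

```python
from typing import List


def data_file_sizes_mb(n_files: int, first_mb: int = 64, cap_mb: int = 2048) -> List[int]:
    """Sizes of dbname.0, dbname.1, ... under the doubling rule quoted above."""
    sizes = []
    size = first_mb
    for _ in range(n_files):
        sizes.append(size)
        size = min(size * 2, cap_mb)  # double until the cap is reached
    return sizes


def allocated_mb(data_mb: int, first_mb: int = 64, cap_mb: int = 2048) -> int:
    """Disk allocated once enough files exist to hold `data_mb` of data."""
    total = 0
    size = first_mb
    while total < data_mb:
        total += size
        size = min(size * 2, cap_mb)
    return total


print(data_file_sizes_mb(7))   # [64, 128, 256, 512, 1024, 2048, 2048]
print(allocated_mb(10))        # 64
print(1000 * allocated_mb(10)) # 64000
```

Under this model, 1,000 tenant databases holding 10MB each occupy at least 64,000MB on disk for 10,000MB of data, which is the overhead the comment describes.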

凉风有信 2024-09-06 14:20:06

I would go for option 2.

However, you could start mongod.exe with the command line option --smallfiles. The largest data file will then be 0.5 GB instead of 2 GB. I tested this with mongo 1.42. So option 3 is not impossible.

叶落知秋 2024-09-06 14:20:06

There is a reasonable article on MSDN about multi-tenant data architecture which you might wish to refer to. Some key topics touched on by this article:

  • Economic considerations
  • Security
  • Tenant considerations
  • Regulatory (legal)
  • Skill set concerns

Also touched upon are some patterns for Software as a Service (SaaS) configuration.

Additionally, worth a gander is an interesting write-up from the SQL Anywhere guys.

My own personal take: unless you are certain of enforced security / trust, I would go with option 3 or, if scalability concerns prohibit that, fall back to option 2 at a minimum. That said... I'm no pro with MongoDB. I get pretty nervous using a shared "schema", but I will happily defer to more experienced practitioners.

ぃ弥猫深巷。 2024-09-06 14:20:06

According to my research in "MongoDB. Trucos y consejos. Aplicaciones multitenant.", option 2 is not recommended if you do not know how many tenants you will have. It could be thousands, which would make sharding complicated; also imagine having thousands of collections in a single database. So in your case option 1 is recommended. If you are going to have a limited number of users, it is a different matter, and yes, you could use option 2 as you planned.
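To show why a single shared collection is simpler to shard than thousands of per-tenant collections, here is a sketch of the one `shardCollection` admin command that option 1 would need. The compound key on a hypothetical `tenant_id` field plus `_id` is an assumption of this example, not a recommendation taken from the article; the dict is only built, not sent to a server.

```python
def shard_command(database: str, collection: str) -> dict:
    """Build a shardCollection admin command for one shared collection."""
    return {
        "shardCollection": f"{database}.{collection}",
        # Hypothetical shard key: group documents by tenant, then by _id.
        "key": {"tenant_id": 1, "_id": 1},
    }


print(shard_command("app", "orders"))
```

With one collection per tenant, an equivalent command would have to be issued and maintained for every tenant's collections.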
