为每个客户端使用单一数据库有哪些优点?

发布于 2024-07-05 13:44:44 字数 537 浏览 8 评论 0原文

在为多个客户端设计的以数据库为中心的应用程序中,我一直认为对所有客户端使用单个数据库“更好”——将记录与正确的索引和键相关联。 在收听 Stack Overflow 播客时,我听到 Joel 提到 FogBugz 每个客户端使用一个数据库(因此,如果有 1000 个客户端,就会有 1000 个数据库)。 使用这种架构有什么优点?

据我了解,对于某些项目,客户需要直接访问其所有数据 - 在这样的应用程序中,显然每个客户都需要自己的数据库。 但是,对于客户端不需要直接访问数据库的项目,每个客户端使用一个数据库有什么优势吗? 就灵活性而言,使用单个数据库和单个表副本似乎要简单得多。 添加新功能更容易,创建报告更容易,而且管理也更容易。

我对“适用于所有客户端的一个数据库”方法非常有信心,直到我听到 Joel(一位经验丰富的开发人员)提到他的软件使用了不同的方法 - 我对他的决定有点困惑......

我听说过人们认为数据库在处理大量记录时会变慢,但是任何具有一定优点的关系数据库都不会出现这个问题 - 特别是如果使用正确的索引和键的话。

任何意见都将不胜感激!

In a database-centric application that is designed for multiple clients, I've always thought it was "better" to use a single database for ALL clients - associating records with proper indexes and keys. In listening to the Stack Overflow podcast, I heard Joel mention that FogBugz uses one database per client (so if there were 1000 clients, there would be 1000 databases). What are the advantages of using this architecture?

I understand that for some projects, clients need direct access to all of their data - in such an application, it's obvious that each client needs their own database. However, for projects where a client does not need to access the database directly, are there any advantages to using one database per client? It seems that in terms of flexibility, it's much simpler to use a single database with a single copy of the tables. It's easier to add new features, it's easier to create reports, and it's just easier to manage.

I was pretty confident in the "one database for all clients" method until I heard Joel (an experienced developer) mention that his software uses a different approach -- and I'm a little confused with his decision...

I've heard people cite that databases slow down with a large number of records, but any relational database with some merit isn't going to have that problem - especially if proper indexes and keys are used.

Any input is greatly appreciated!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

眼泪都笑了 2024-07-12 13:44:44

这是我以前见过的一种方法:

  • 每个客户都有一个存储在主客户数据库中的唯一连接字符串。
  • 数据库的设计使得所有内容都按 CustomerID 进行分段,即使数据库上只有一个客户。
  • 创建脚本以将所有客户数据迁移到新数据库(如果需要),然后只需更新该客户的连接字符串以指向新位置。

这允许首先使用单个数据库,然后一旦您拥有大量客户,或更常见的是当您有几个过度使用系统的客户时,可以轻松地进行分段。

我发现,当所有数据都位于同一个数据库中时,恢复特定客户数据确实很困难,但管理升级要简单得多。

当每个客户使用单个数据库时,您会遇到一个巨大的问题,即让所有客户运行在相同的架构版本上,而且这甚至没有考虑对一大堆特定于客户的数据库进行备份作业。 当然,恢复数据会更容易,但如果您确保不永久删除记录(只需标记已删除标志或移动到存档表),那么您一开始就不需要数据库恢复。

Here's one approach that I've seen before:

  • Each customer has a unique connection string stored in a master customer database.
  • The database is designed so that everything is segmented by CustomerID, even if there is a single customer on a database.
  • Scripts are created to migrate all customer data to a new database if needed, and then only that customer's connection string needs to be updated to point to the new location.

This allows for using a single database at first, and then easily segmenting later on once you've got a large number of clients, or more commonly when you have a couple of customers that overuse the system.

I've found that restoring specific customer data is really tough when all the data is in the same database, but managing upgrades is much simpler.

When using a single database per customer, you run into a huge problem of keeping all customers running at the same schema version, and that doesn't even consider backup jobs on a whole bunch of customer-specific databases. Naturally restoring data is easier, but if you make sure not to permanently delete records (just mark with a deleted flag or move to an archive table), then you have less need for database restore in the first place.

删除→记忆 2024-07-12 13:44:44

假设将所有客户端存储在一个数据库中不会造成扩展损失; 对于大多数人来说,以及配置良好的数据库/查询,现在这将是相当正确的。 如果您不是这些人中的一员,那么单一数据库的好处是显而易见的。

在这种情况下,好处来自于每个客户端的封装。 从代码的角度来看,每个客户端都是孤立存在的 - 数据库更新不可能覆盖、损坏、检索或更改属于另一个客户端的数据。 这也简化了模型,因为您无需考虑记录可能属于另一个客户端的事实。

您还可以获得可分离性的好处 - 提取与给定客户端关联的数据并将它们移动到不同的服务器非常简单。 或者,当调用“我们已经删除了一些关键数据!”时,使用内置数据库机制恢复该客户端的备份。

您可以获得轻松且免费的服务器移动性 - 如果您扩展一台数据库服务器,您可以在另一台服务器上托管新客户端。 如果它们全部位于一个数据库中,则您需要获得更强大的硬件,或者在多台计算机上运行数据库。

您可以轻松进行版本控制 - 如果一个客户端想要保留软件版本 1.0,而另一个客户端想要 2.0,其中 1.0 和 2.0 使用不同的数据库模式,那么没有问题 - 您可以迁移一个客户端,而无需将它们从一个数据库中取出。

我想我还能想到几十个。 但总而言之,关键概念是“简单”。 该产品管理一个客户端,从而管理一个数据库。 “但是数据库还包含其他客户端”问题从来没有任何复杂性。 它符合用户单独存在的心理模型。 能够同时对所有客户进行简单报告之类的优势是微不足道的 - 您多久需要一份关于整个世界而不仅仅是一个客户的报告?

Assume there's no scaling penalty for storing all the clients in one database; for most people, and well configured databases/queries, this will be fairly true these days. If you're not one of these people, well, then the benefit of a single database is obvious.

In this situation, benefits come from the encapsulation of each client. From the code perspective, each client exists in isolation - there is no possible situation in which a database update might overwrite, corrupt, retrieve or alter data belonging to another client. This also simplifies the model, as you don't need to ever consider the fact that records might belong to another client.

You also get benefits of separability - it's trivial to pull out the data associated with a given client ,and move them to a different server. Or restore a backup of that client when the call up to say "We've deleted some key data!", using the builtin database mechanisms.

You get easy and free server mobility - if you outscale one database server, you can just host new clients on another server. If they were all in one database, you'd need to either get beefier hardware, or run the database over multiple machines.

You get easy versioning - if one client wants to stay on software version 1.0, and another wants 2.0, where 1.0 and 2.0 use different database schemas, there's no problem - you can migrate one without having to pull them out of one database.

I can think of a few dozen more, I guess. But all in all, the key concept is "simplicity". The product manages one client, and thus one database. There is never any complexity from the "But the database also contains other clients" issue. It fits the mental model of the user, where they exist alone. Advantages like being able to doing easy reporting on all clients at once, are minimal - how often do you want a report on the whole world, rather than just one client?

别理我 2024-07-12 13:44:44

至于一次升级 1000 个数据库服务器的痛苦,一些相当简单的自动化应该可以解决。 只要每个数据库都维护相同的模式,那么它就不会真正成为问题。 我们还使用每个客户端数据库的方法,它对我们来说效果很好。

这里有一篇关于这个主题的文章(是的,它是 MSDN,但它是一篇独立于技术的文章): http://msdn.microsoft.com/en-us/library/aa479086.aspx

关于多租户的另一个讨论,因为它与您的数据模型相关:http://www.ayende.com/Blog/archive/2008/08/07/Multi-Tenancy--The-Physical-Data-Model.aspx

As for the pain of upgrading 1000 database servers at once, some fairly simple automation should take care of that. As long as each database maintains an identical schema, then it won't really be an issue. We also use the database per client approach, and it works well for us.

Here is an article on this exact topic (yes, it is MSDN, but it is a technology independent article): http://msdn.microsoft.com/en-us/library/aa479086.aspx.

Another discussion of multi-tenancy as it relates to your data model here: http://www.ayende.com/Blog/archive/2008/08/07/Multi-Tenancy--The-Physical-Data-Model.aspx

执着的年纪 2024-07-12 13:44:44

为了保持简单。 您可以确定您的客户只能看到他们的数据。 记录较少的客户不必支付与数据库中可能存在但不是他们自己的数十万条记录竞争的惩罚。 我不在乎所有内容的索引和优化程度如何,都会有查询确定它们必须扫描每条记录。

To keep it simple. You can be sure that your client is only seeing their data. The client with fewer records doesn't have to pay the penalty of having to compete with hundreds of thousands of records that may be in the database but not theirs. I don't care how well everything is indexed and optimized there will be queries that determine that they have to scan every record.

谎言 2024-07-12 13:44:44

那么,如果您的一位客户由于导入作业或类似情况而要求您恢复到其数据的早期版本,该怎么办? 想象一下,如果您告诉客户“您不能这样做,因为您的数据在我们所有客户之间共享”或“抱歉,您的更改丢失了,因为客户 X 要求恢复数据库”,他们会有什么感受。

Well, what if one of your clients tells you to restore to an earlier version of their data due to some botched import job or similar? Imagine how your clients would feel if you told them "you can't do that, since your data is shared between all our clients" or "Sorry, but your changes were lost because client X demanded a restore of the database".

半窗疏影 2024-07-12 13:44:44

“数据库”有多种含义,

  • 硬件盒,
  • 运行软件(例如“oracle”),
  • 特定的数据文件集,
  • 特定的登录名或模式,

Joel 可能指的是较低层之一。 在这种情况下,这只是软件配置管理的问题......例如,您不必修补 1000 个软件服务器来修复安全错误。

我认为这是一个好主意,这样软件错误就不会在客户端之间泄露信息。 想象一下有一个错误的 where 子句的情况,该子句向我显示了您的客户数据以及我自己的客户数据。

There are a couple of meanings of "database"

  • the hardware box
  • the running software (e.g. "the oracle")
  • the particular set of data files
  • the particular login or schema

It's likely Joel means one of the lower layers. In this case, it's just a matter of software configuration management... you don't have to patch 1000 software servers to fix a security bug, for example.

I think it's a good idea, so that a software bug doesn't leak information across clients. Imagine the case with an errant where clause that showed me your customer data as well as my own.

清君侧 2024-07-12 13:44:44

在医疗保健等受监管行业中,可能需要每个客户一个数据库,甚至可能需要一个单独的数据库服务器。

升级时更新多个数据库的简单答案是将升级作为事务进行,并在必要时在升级之前拍摄快照。 如果您的操作运行良好,那么您应该能够将升级应用到任意数量的数据库。

集群并不是真正解决索引和全表扫描问题的方法。 如果迁移到集群,变化很小。 如果您有许多较小的数据库要分布在多台计算机上,则无需集群即可更便宜地完成此操作。 可靠性和可用性是考虑因素,但可以通过其他方式处理(有些人仍然需要集群,但大多数人可能不需要)。

我很想听听您对此的更多背景信息,因为集群不是一个简单的主题,而且在 RDBMS 世界中实现起来成本高昂。 在非关系世界中,有很多关于集群的讨论/虚张声势,例如 Google Bigtable 等,但它们正在解决一组不同的问题,并且失去了 RDBMS 的一些有用功能。

In regulated industries such as health care it may be a requirement of one database per customer, possibly even a separate database server.

The simple answer to updating multiple databases when you upgrade is to do the upgrade as a transaction, and take a snapshot before upgrading if necessary. If you are running your operations well then you should be able to apply the upgrade to any number of databases.

Clustering is not really a solution to the problem of indices and full table scans. If you move to a cluster, very little changes. If you have have many smaller databases to distribute over multiple machines you can do this more cheaply without a cluster. Reliability and availability are considerations but can be dealt with in other ways (some people will still need a cluster but majority probably don't).

I'd be interested in hearing a little more context from you on this because clustering is not a simple topic and is expensive to implement in the RDBMS world. There is a lot of talk/bravado about clustering in the non-relational world Google Bigtable etc. but they are solving a different set of problems, and lose some of the useful features from an RDBMS.

舞袖。长 2024-07-12 13:44:44

可扩展性。 安全。 我们公司也采用每个客户 1 DB 的方法。 它还使代码更容易维护。

Scalability. Security. Our company uses 1 DB per customer approach as well. It also makes code a bit easier to maintain as well.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文