将数百个 Oracle 实例合并为一个实例的智慧
我们的应用程序运行在网络上,主要是一个查询工具,做一些交易。我们托管 Oracle 数据库。该应用程序始终为每个客户提供不同的 Oracle 实例。 客户是指向我们的员工(通常每个客户有 10,000-25,000 名员工)提供服务而向我们付费的公司。我们打算拥有数百个客户。我们每隔几年就会发布一次主要版本,迁移到新版本具有挑战性:我们可能会有一个团队在客户现场工作几周,解释新功能并设置驱动数据以适应该客户。
我们正在考虑采用多客户端,将所有客户置于大型 Windows Server 2008 服务器上的单个共享 Oracle 11g 实例中,以降低成本。我想知道这是否可取。
为每个客户提供单独的实例有一些优点。请告诉我这些实例是否是假的。我对重要性降低的粗略猜测是:
当对架构进行重大更改时,我们的客户 MyCorp 和 YourCo 可以单独迁移。 (对于多客户端,我们将在一夜之间迁移 300 多个客户!?!)
MyCorp 的数据可以轻松备份和(!!!)恢复,而不会影响其他客户。
MyCorp 的数据与竞争对手 YourCo 的数据安全分离,无需依赖开发人员来获得正确的代码和/或 DBA 获得正确的配置。
多个实例的风险较低,因为一个客户的灾难(有人不小心使每个人的工资翻倍,并且在发薪日后发现错误)不会影响其他客户。一场影响我们所有客户的灾难(哎呀,新的 DBA,突然间每个参与者都有相同的 SSN!?!)可能会让我们的公司陷入困境。
一台服务器上有一个实例会出现单点故障,如果飓风摧毁建筑物,我们的整个客户群就会停业。多台服务器上的多个实例允许地理分散:任何灾难都不会影响我们太大比例的客户,其他地区未受影响的服务器可以承担故障服务器的负载。
性能更好,因为数据库较小(约 50 个表中的 10,000 行与 2,000,000 行)。
如果 MyCorp 的办事处(大部分)仅位于一个区域,则 MyCorp 的实例可以在地理上位于该区域,因此网络延迟不会影响性能。出于同样的原因,我们可以为全球客户提供更好的服务。
在 MyCorp 中想要将他们的数据库放在内部,然后我们可以轻松导出他们的实例,以获取 MyCorp 他们的数据。
负载平衡更容易,因为实例可以放置在不同的服务器上(这是使用网络场)。
当需要 DEV 或 QA 实例时,克隆真实实例并对数据进行匿名化会更容易,因为数据要少得多。
由于它们足够小,开发人员可以在本地运行自己的实例,这样他们就可以在机场等待和飞行中处理代码,而无需担心 VPN 麻烦。
问题1:单独实例还有哪些其他优势?
我们正在考虑更改数据库架构并将所有客户合并到一个在一台大型服务器上运行的 Oracle 实例中。
以下是多客户端实例方法的优点,首先是最重要的(我的 WAG)。如果这些是假的,请予以批评:
DBA 的工作量会减少,因为他们只需要维护一个实例而不是数百。更少的 DBA 工作意味着更便宜,这是我们进行此更改的主要动机。
只需一个实例,DBA 就可以更好地优化性能。他们将有时间添加适当的索引并检查我们的 SQL。
开发者调试和调试会更加容易。增强应用程序,因为只有一个架构和一个应用程序(如果有数百个实例,则可能有几十个架构版本,每个版本的架构都有不同版本的应用程序)。这也降低了成本。另一种方法是必须在每次调试会话时都使用 (1) 该客户运行的版本以及 (2) 让我们努力重新创建相应的开发环境、代码和数据库。 (我们需要一个虚拟机,其中包含每个补丁和版本的代码和数据库实例!)
Oracle 许可更便宜,因为它是按每台服务器定价的,与重量无关(或者其他什么——我对这个主题一无所知) .
数据库成为 Web 会话数据的可行持久存储,因为只有一个实例。
使用一个多客户端实例可以更轻松地进行某些数据库操作,例如当参与者不清楚自己(或可能是他们的配偶)为哪个客户工作时找到参与者:所有姓名都在一张表中。跨客户报告非常简单。
问题 2:在一个实例中拥有多个客户端还有哪些其他优势?
问题3:您认为哪种方法更好(为什么)?每个客户的实例,还是一个实例中的所有客户?
我担心拥有一个多客户端实例会使迁移几乎不可能,这是一个交易杀手......
除非有一种折衷的解决方案,例如拥有两个多客户端实例,旧的和新的。在这种情况下,我们将设计用于查找参与者、报告等的跨实例解决方案,以便客户可以从一个多客户端实例转移到下一个,而不会出现任何中断。
Our application runs on the web, is mostly an inquiry tool, does some transactions. We host the Oracle database. The app has always had a different instance of Oracle for each customer. A customer is a company which pays us to provide our service to the company's employees, typically 10,000-25,000 employees per customer. We intend to have several hundred customers. We do a major release every few years, and migrating to that new release is challenging: we might have a team at the customer site for a couple weeks, explaining new functionality and setting up the driving data to suit that customer.
We're considering going multi-client, putting all our customers into a single shared Oracle 11g instance on a big honkin' Windows Server 2008 server -- in order to reduce costs. I'm wondering if that's advisable.
There are some advantages to having separate instances for each customer. Tell me if these are bogus, please. In my rough guess about decreasing importance:
Our customers MyCorp and YourCo can be migrated separately when breaking changes are made to the schema. (With multi-client, we'd be migrating 300+ customers overnight!?!)
MyCorp's data can be easily backed up and (!!!) restored, without affecting other customers.
MyCorp's data is securely separated from their competitor YourCo's data, without depending on developers to get the code right and/or DBAs getting the configuration right.
Multiple instances are lower risk, because a disaster with one customer (someone accidentally doubles everyone's salary and the error is discovered after pay day) doesn't affect other customers. A disaster that affected ALL our customers (whoops, new DBA, and suddenly every participant has the same SSN!?!) might put our company under.
Having one instance on one server presents a single point of failure, with our entire customer base out of business if a hurricane knocks the building over. Multiple instances on multiple servers permits geographic dispersion: no catastrophe will affect too large a proportion of our customers, and the unaffected servers in other regions can take on the load of the failed servers.
Performance is better because the database is smaller (10,000 vs 2,000,000 rows in ~50 tables).
If MyCorp's offices are (mostly) in just one region, then the MyCorp's instance can be geographically co-located there, so network lag doesn't hurt performance. We can provide better service to global clients, for the same reason.
In MyCorp wants to take their database in-house, then we can easily export their instance, to get MyCorp their data.
Load-balancing is easier because instances can be placed on different servers (this is with a web farm).
When a DEV or QA instance is needed, it's easier to clone the real instance and anonymize the data, because there's much less data.
Because they're small enough, developers can have their own instance running locally, so they can work on code while waiting at the airport and while in-flight, without fighting VPN hassles.
Q1: What are other advantages of separate instances?
We are contemplating changing the database schema and merging all of our customers into one Oracle instance, running on one hefty server.
Here are advantages of the multi-client instance approach, most important first (my WAG). Please snipe if these are bogus:
Less work for the DBAs, since they only need to maintain one instance instead of hundreds. Less DBA work translates to cheaper, our main motive for this change.
With just one instance, the DBAs can do a better job of optimizing performance. They'll have time to add appropriate indexes and review our SQL.
It will be easier for developers to debug & enhance the application, because there is only one schema and one app (there might be dozens of schema versions if there are hundreds of instances, with a different version of the app for each version of the schema). This reduces costs too. The alternative is having to start every debug session with (1) What version is this customer running and (2) Let's struggle to recreate the corresponding development environment, code and database. (We need a Virtual Machine that includes the code AND database instance for each patch and release!)
Licensing Oracle is cheaper because it's priced per server irrespective of heft (or something -- I don't know anything about the subject).
The database becomes a viable persistent store for web session data, because there is just one instance.
Some database operations are easier with one multi-client instance, like finding a participant when they're hazy about which customer they (or their spouse, maybe) works for: all the names are in one table. Reporting across customers is straightforward.
Q2: What are other advantages of having multiple clients in one instance?
Q3: Which approach do you think is better (why)? Instance per customer, or all customers in one instance?
I'm concerned that having one multi-client instance makes migration near-impossible, and that's a deal killer...
... unless there is a compromise solution like having two multi-client instances, the old and the new. In that case case, we would design cross-instance solutions for finding participants, reporting, etc. so customers could go from one multi-client instance to the next without anything breaking.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
也许值得研究 salesforce,而您正在寻找的流行词是“多租户架构”,
这值得一读:
http://blog.dayspring-tech.com/2009/02/forcecom-multitenant-architecture-under-the-covers/
这是一个很好的例子,因为 Salesforce 确实在幕后使用了 Oracle 数据库。
It may be worth researching salesforce, and the buzz word you're looking for is "multi tenant architecture"
This makes a good read:
http://blog.dayspring-tech.com/2009/02/forcecom-multitenant-architecture-under-the-covers/
It's a good example because Salesforce do use an Oracle db under the covers.
好问题,很高兴看到您正在考虑所有替代方案。有很多好的观点,但我只想谈一谈。
我是托管应用程序的 DBA,开发人员决定为此使用 Oracle 虚拟专用数据库功能。
该应用程序的构建目的是让客户共享一个应用程序服务器池以实现负载平衡,并在后端共享一个数据库模式。
在 VPD 之前,我们有一个 Java 类,它添加了“where customer_id=?”或“和 customer_id=?”在每个查询进入数据库之前,这样客户只能看到他们的数据。为了在登录数据库时在 VPD 中实现这一点,我们将让应用程序在应用程序上下文中设置一个变量,VPD 策略将使用该变量来允许会话仅查看其记录。所以,是的,您必须正确编码并将 VPD 策略分配给表,并且还要相信 Oracle 会履行他们的承诺。
那么这对我们有好处吗?理论上,将 SQL 谓词处理卸载到应用程序外部是件好事,但实际上,优点并没有超过缺点。
当我们在一个数据库中有数十个客户端时,当我们升级时,它们都必须同时升级。我们与客户进行了很多拉锯战,他们出于某种原因不想升级,或者想对新版本进行自己的质量检查。
我们接受了旧实例/新实例的升级,但迁移数据是有风险的,而且相关的停机时间并不能让客户满意。我们确实推出了自己的程序,可以逐步浏览表格并导出数据......但肯定不像快速导出或数据泵作业那么容易。
在分区方面,我们还遇到了 VPD 谓词分析的问题。与许多 Oracle 功能一样,它们本身可能可以正常工作,但一旦与其他功能结合使用,事情就会变得不可预测。对于我们来说,与当前 customer_id 无关的分区没有被消除,因为谓词分析在 SQL 语句的处理中来得太晚了。我们通过从静态 VPD 策略更改为动态 VPD 策略来解决这个问题,但我们花在解析上的时间却大幅增加。
那么毕竟我对此有何看法呢?我会花时间确保我们的应用程序充分利用绑定变量,并继续使用将 customer_id 添加到 SQL 语句的旧机制。
Good question, glad to see you are considering all the alternatives. Lots of good points but I will stick to just addressing one.
I was the DBA for a hosted application and the developers decided to use Oracle Virtual Private Database feature for this.
The application was constructed with intention of customers sharing a pool of app servers for load balancing and a single database schema on the back end.
Before VPD we had a Java class that tacked "where customer_id=?" or "and customer_id=?" on every query right before it went to the database so the customer would only see their data. To implement this in VPD upon login ot the DB we would have the app set a variable in the app context that would be used by the VPD policies to allow the session to only see their records. So yeah, you have to code it up right and assign VPD policies to tables, and also trust that Oracle holds up their end of the bargain.
So was it good for us? In theory it was nice to offload the SQL predicate handling to something outside our application but in practice the advantages didn't outweight the disadvantages.
When we had dozens of clients in one database and when we upgraded they all had to get upgraded at the same time. We had lots of tug-of-wars with customers that didn't want to upgrade for whatever reason or wanted to do their own QA on the new versions.
We entertained the Old instance/New instance thing for upgrades but migrating data was risky and associated downtime did not make customers happy. We did roll our own procedure that would step through tables and export data... But certainly not as easy as a quickie Export or Data Pump job.
We also had issues with VPD predicate analysis when it came to Partitioning. As with alot of Oracle features they may work OK on their own but once you combine with other features things get unpredictable. For us partitions not related to the current customer_id weren't getting eliminated because the predicate analysis was coming too late in the processing of the SQL statement. We worked around it by changing from static to dynamic VPD policies but our time spent parsing shot up.
So after all that what is my take on it? I would have spent the time making sure our app made good use of bind variables and continued with the old mechanism that added customer_id to the SQL statement.
Oracle 就是为了处理这种负载而设计的。
我的问题 - 当你有一千个客户并说一万个时你会怎么做?
您仍然保留单独的实例/架构吗?
我怀疑有人会这么做。我之前曾在一个地方工作过,每个客户都有单独的数据库以及位于中心位置的副本。
变更管理变得令人头疼,您必须维护有关哪个客户/公司使用哪个数据库修订版、模式、应用程序版本以及所有这些信息的非常好的信息。这本身就成为一个软件。
我建议围绕 SaaS 模型创建软件/设计,这将使您能够轻松维护并为所有用户提供相同的数据库/架构。
为了可靠性,您仍然可以使用集群 - Oracle RAC。
Oracle is made to handle that kind of load.
My Question - What do you do when you have thousand customers and say ten thousand?
Do you still keep separate instances/schema?
I doubt anyone will do that. I have worked earlier in a place where each client had separate database as well as a copy at a central place.
Change management becomes a headache, you'd have to maintain a very good information about which client/company is on which database revision, schema, app version and all those things. This'd become a software in itself.
I'd suggest to create software/design based around SaaS model, that'll allow you easy maintenance and same database/schema for all users.
For Reliability you can still use clustering - Oracle RAC.
我不得不多次考虑同一个决定。在我们的案例中,我们使用 MySQL,因此在单独的数据库中运行所有客户不会产生任何成本。
在单独的数据库上运行所有客户的好处是巨大的。我们有一个脚本,可以让我们将客户的整个实例移动到任何服务器以平衡负载。该脚本仅复制数据库、复制任何自定义文件、启动应用程序并设置我们的路由系统以将用户发送到新实例。整个过程只需几分钟。
在大型 mysql 数据库上,数据库更改可能需要很长时间。由于我们所有的客户都有自己的数据库,因此我们能够保持所有数据集较小。备份也非常快。
我们的开发实例的行为方式相同,因此这种方法允许我们在开发和测试新功能时同时运行各种数据库模式。我们经常与客户合作,让他们尝试新功能,然后再将其部署到我们的其余实例。我们坚持的一条规则(为了避免您提到的一些缺点)是所有客户端都必须位于彼此的一个版本内。跨客户端维护多个版本将产生巨大的开销。
Facebook 在创办公司时也采取了同样的方法。他们启动的每所学校都有一个单独的数据库,他们能够非常快速地建立新实例。他们最终整合数据库的主要原因是他们希望使用户能够在学校之间进行交流。
如果不是因为潜在的成本问题,我肯定会鼓励您坚持使用单独的数据库方法。
I've had to consider the same decision a few times. In our case we use MySQL, so there is no cost associated with running all customers in a separate database.
The benefits to running all of our customers on a separate database have been great. We have a script that lets us move a customer's entire instance to any server to balance load. The script merely copies over the database, copies over any custom files, spins up the application, and sets up our routing system to send users to the new instance. The whole process takes just a few minutes.
Database changes can take a very long time on large mysql databases. Since all our clients have their own database we are able to keep all of our datasets small. Backups are also very fast.
Our development instances behave the same way, so this method allows us to run a variety of database schemas simultaneously as we develop and test new features. We often work with customers to have them try out a new feature before we deploy it to the rest of our instances. The one rule that we stick to (in order to avoid a few of the drawbacks you mention), is that all clients must be within one version of each other. Maintaining more than a couple versions across clients would have a huge overhead.
Facebook took the same approach when they started their company. Each school that they launched at had a separate database and they were able to set up new instances very quickly. The primary reason they finally consolidated their database was that they wanted to enable users to communicate between schools.
If not for potential cost issues I would definitely encourage you to stick with the separate database approach.
除非您使用 Oracle XE(有限的免费版本),否则即使您购买的是单核、单 CPU 设备,每台服务器只有一个数据库很快就会变得非常昂贵。每台服务器拥有多个数据库效率很低,因为每个数据库都会产生 CPU 和 RAM 使用开销。调优更加困难,因为争用更难诊断。
因此,除了更易于管理之外,单个大型服务器应该比许多离散的小型服务器更便宜(没有保证,不退款!)。确保您购买最大、最快的芯片以及尽可能多的可用插槽内存。这些可以为您提供更好的性能,而不影响您的许可成本。
如果您负担得起,请考虑分区选项。这将解决您对备份和恢复的担忧,因为每个分区都可以有自己的表空间。因此(按 client_id 进行分区)可以备份或恢复单个客户端的数据而不影响其他客户端。我们甚至可以导出和导入单个分区。我对 David 的观察结果感到惊讶,即分区修剪不适用于 VPD。但我还没有尝试过这个组合,所以我相信他的话。
整合可能会失去的一件事是在不同版本的应用程序上支持不同客户端的能力。然而,这并不一定是坏事。正如您所观察到的,如果您放弃应用程序的个性化版本,维护数百个客户将会容易得多。如果您确实需要提供一些定制功能 - 即使您只想使用单个客户端对某些功能进行 Beta 测试 - 那么请查看 11gR2 中基于版本的重新定义:这是一个非常漂亮的功能。此外,它还适用于所有 Oracle 许可证,而不仅仅是企业版。
Unless you are using Oracle XE (the limited, free edition) having one database per server will get very expensive very quickly, even if you're buying single core, single CPU boxes. Having several databases per server is inefficient, because each database incurs an overhead of CPU and RAM usage. Tuning is more difficult, because contention is harder to diagnose.
So, as well as being easier to administer, a single big server ought to work out cheaper than lots of discrete little servers (no guarantees, no money back!). Make sure you buy the biggest, fastest chips you can and as much RAM as you have free slots. Those are things which give you better performance without affecting your licensing costs.
Consider the Partitioning option, if you can afford it. This will address your concerns regarding backup and recovery, because each partition can have its own tablespace. So (given partitioning by client_id) it becomes possible to backup or restore an individual client's data without affecting the other clients. We can even export and import individual partitions. I'm surprised by David's observation that Partition pruning didn't work with VPD. But I haven't tried this combo, so I'll take his word for it.
The one thing you might lose from consolidation is the ability to support different clients on different versions of your application. However, this is not necessarily a bad thing. As you observe, maintaining several hundred customers will be a lot easier if you forgo individualised versions of the application. If you do need to offer some bespoke features - even if you just want to beta test some functionality with an individual client - then have a look at Edition-Based Redefinition in 11gR2: it is a really nifty feature. Also it is available for all Oracle licenses, not just Enterprise.
当您说“单独的实例”时,您是在谈论一个具有多个架构的实例吗?或者您真的是指在一台机器上运行多个实例吗?没有什么理由在一台机器上运行多个实例,而不是在一个实例上运行多个模式 - 每个模式仍然有自己的一组表、索引等。
无论如何,我没有完整的答案,但需要记住的一件事是 Oracle 的许可成本,以及它如何影响最佳解决方案。
根据 Oracle 商店的信息,
因此,例如,如果您需要8 个四核 CPU 可以处理 100 个客户,单个数据库上的许可比拥有 4 个独立数据库(每个数据库有 2 个四核 CPU,每个数据库运行 25 个客户)要昂贵得多。
8 个四核 CPU 需要企业版,标价为 16 x 47,500 美元 = 760,000 美元。 4 台机器,每台运行标准版 1,每台都有 2 个四核 CPU,其标价为 8 x 5,800.00 美元 = 46,400 美元 - 相差 16 倍。现在,请记住,没有人为企业版支付标价,但仍然存在巨大差异需要考虑。
如果您对跨客户端的数据库操作没有巨大的需求,并且不需要企业版功能,并且需要此级别的 CPU 能力(或预计将增长到需要此级别的 CPU 能力),则许可成本这将是单一实例方法的一个巨大缺点。
When you say 'separate instances', are you talking about one instance with multiple schemas on it? Or do you really mean multiple instances running on a single machine? There is little reason to run multiple instances on a single machine, as opposed to running multiple schemas on a single instance - each schema would still have their own set of tables, indexes, etc.
Anyways, I don't have a full answer, but one thing to keep in mind is the licensing costs of Oracle, and how that can affect what the optimal solution is.
According to the Oracle store,
So if, for example, you need 8 quad core CPUs to handle 100 customers, licensing that on a single database is VASTLY more expensive than having 4 separate databases, each having 2 quad core CPUs, each running 25 customers.
8 quad core CPUs requires enterprise edition, and would have a list price of 16 x $47,500 = $760,000. 4 machines, each running standard edition one, and each with 2 quad-core CPUS, would have a list price of 8 x $5,800.00 = $46,400 - a factor of 16 difference. Now, keep in mind that no one pays list price for enterprise edition, but there is still a huge difference to consider.
If you don't have a huge need for database operations across clients, and you don't need enterprise edition features, and you need this level of CPU power (or expect to grow to need this level of CPU power), the licensing costs are going to be a huge downside of the one-instance approach.