How should I handle a shared database that needs to be referenced by multiple applications?
In my SOA, I have 2 services - a users service and a products service. Both users and products can be "tagged" with 2 objects - countries and industries. This means that both services will have a join table, and future services will need that as well. I would like the database of countries and industries to be standard and managed from one place if possible. There are a few options I can think of:

- Keep the countries, industries, and other shared databases on their own server and allow external read-only connections, while manipulating the data would have to be done by one app whose sole purpose is to manage that data.
- Keep copies of those tables in a database local to each service, and have them act as slave tables. The master tables will be maintained by an app that manages that data and pushes out updates to those slave tables.

Am I missing any good options? Out of those 2 or any other proposed, which would you go with and why?
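To make the setup concrete, here is a minimal sketch of the schema being described, using SQLite for illustration. All table and column names (`countries`, `user_countries`, `product_industries`, etc.) are assumptions, not anything from the actual services:

```python
import sqlite3

# Hypothetical schema: shared reference tables (countries, industries)
# plus the per-service join tables the question describes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE industries (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- users service
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_countries (
        user_id    INTEGER REFERENCES users(id),
        country_id INTEGER REFERENCES countries(id),
        PRIMARY KEY (user_id, country_id)
    );

    -- products service needs the same reference data
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product_industries (
        product_id  INTEGER REFERENCES products(id),
        industry_id INTEGER REFERENCES industries(id),
        PRIMARY KEY (product_id, industry_id)
    );
""")

# Tag a user with a country and read the tag back through the join table.
conn.execute("INSERT INTO countries VALUES (1, 'Canada')")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO user_countries VALUES (1, 1)")
row = conn.execute("""
    SELECT u.name, c.name
    FROM users u
    JOIN user_countries uc ON uc.user_id = u.id
    JOIN countries c       ON c.id = uc.country_id
""").fetchone()
print(row)  # ('alice', 'Canada')
```

The question is where the `countries` and `industries` tables should physically live once `users` and `products` are split into separate services with separate databases.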
If you worry too much about referential integrity of your User and Product tables, I would keep a read-only copy of those tables in multiple databases and establish FKs. Then write a single service to keep all the copies up to date.

Otherwise, the hub-and-spoke solution you mentioned in option 1 would do the job. But you would need to control data entry to your user and product tables programmatically to establish good data quality.
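The "single service that keeps all the copies up to date" could be as simple as a periodic job that replaces each service's local copy with the master's current rows. A rough sketch, again using SQLite; the `sync_countries` function and all names are hypothetical:

```python
import sqlite3

# Master DB owns the reference data; each service DB holds a local copy
# that its own FKs can point at.
master = sqlite3.connect(":memory:")
master.execute("CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT)")
master.execute("INSERT INTO countries VALUES (1, 'Canada'), (2, 'Japan')")

service_db = sqlite3.connect(":memory:")
service_db.execute("CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT)")

def sync_countries(src, dst):
    """Replace dst's copy of countries with src's current rows, atomically."""
    rows = src.execute("SELECT id, name FROM countries").fetchall()
    with dst:  # one transaction: readers never see a half-synced table
        dst.execute("DELETE FROM countries")
        dst.executemany("INSERT INTO countries VALUES (?, ?)", rows)

sync_countries(master, service_db)
count = service_db.execute("SELECT COUNT(*) FROM countries").fetchone()[0]
print(count)  # 2
```

In production this would run on a schedule or be triggered by the management app after each change; the full delete-and-reload shown here is only practical because reference data like countries and industries is small and changes rarely.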
You might want to look into Replication. From your description, Snapshot replication would seem the most suitable. You would have one database where you maintain your common tables (using a dedicated app, or direct SQL, or whatever), and then your service databases would be subscribers. SQL Server would take care of copying the data around between the servers.

You can even then have Foreign Keys to the common tables (because they appear in each of the service databases), which is effectively your option 2, but with the "how to synchronize the master and slave tables" bit filled in.
Use one database with all the tables, then open a connection from both services to it, write clean transactions and let ACID properties handle any (unlikely) problems?
Simple, efficient, easy to do.
Is there a requirement that would prevent this solution from working?
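For scale, the single-database suggestion looks like this: every service connects to the same physical database, and committed writes are immediately visible to everyone, so there is nothing to synchronize. A sketch with SQLite standing in for the shared server (file path and names are illustrative):

```python
import os
import sqlite3
import tempfile

# One shared database file; both "services" open their own connection to it.
path = os.path.join(tempfile.mkdtemp(), "shared.db")
users_conn = sqlite3.connect(path)
products_conn = sqlite3.connect(path)

users_conn.execute("CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT)")
users_conn.commit()

# The users service writes inside a transaction; the database serializes it.
with users_conn:
    users_conn.execute("INSERT INTO countries VALUES (1, 'Canada')")

# The products service sees the committed row immediately - no sync job needed.
row = products_conn.execute("SELECT name FROM countries WHERE id = 1").fetchone()
print(row[0])  # Canada
```

The trade-off, as the later answers note, is that every service now takes a network hop to one central server, and that server becomes a single point of failure.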
Replication and data caching are among those things that are really interesting to talk about but rarely the right choice. Databases are designed to deal with shared data, and most are quite good at it. You should only keep multiple copies of the data if doing so addresses a specific requirement such as:

If you don't have a special requirement, listen to Kdansky. Keep it simple. Spend as much of your development time as possible providing your users with features... not writing data integrity monitors.
Right now you are basically comparing simplicity versus performance (since you mentioned referential integrity isn't an issue). In option 1, everything is centrally located and managed; it will require additional network hops to pull the data, but you don't have to worry about stale data or managing multiple databases, which could have their own issues. If your main server works, then it works for everyone. In option 2, storing your data locally will improve performance, but it will complicate long-term maintenance of the system and create potential data-inconsistency problems (although given that your data is countries and industries, it probably won't change very often, so it shouldn't need frequent updates).

My recommendation would be to build the simpler solution (option 1) and then, if the need arises, tune for performance (using option 2 or possibly some other solution). When tuning for performance, look at the latency between the servers before you add local databases; you may be able to fix the problem with indexes or statistics.