SOA and shared databases
I don't understand how SOA (service-oriented architecture) and databases fit together. While I'm attracted by the SOA concept (encapsulating reusable business logic into services), I can't figure out how it's supposed to work if data tables encapsulated in one service are required by other services/systems. Or is SOA not suitable at all in this scenario?
To be more concrete, suppose I have two services:
- CustomerService: contains my Customers database table and associated business logic.
- OrderService: contains my Orders table and logic.
Now what if I need to JOIN the Customers and Orders tables with an SQL statement? If the tables contain millions of entries, unacceptable performance would result if I have to send the data over the network using SOAP/XML. And how to perform the JOIN?
Doing a little research, I have found some proposed solutions:
- Use replication to make a local copy of the required data where needed. But then there's no encapsulation, so what's the point of using SOA? This is discussed on Stack Overflow, but there's no clear consensus.
- Set up a Master Data Service which encapsulates all database data. I guess it would get monster-sized (with essentially one API call for each stored procedure) and require updates all the time. To me this seems related to the enterprise data bus concept.
If you have any input on this, please let me know.
Comments (3)
One of the defining principles of a "service" in this context is that it owns, absolutely, the data in the area it is responsible for, as well as the operations on that data.
Copying data, through replication or any other mechanism, ditches that responsibility. Either you replicate the business rules, too, or you will eventually wind up in a situation where you need the other service updated before you can change your own internal rules.
Using a single data service is just "don't do SOA"; if you have one single place that manages all data, you don't have independent services, you just have one service.
I would suggest, instead, the third option: use composition to put that data together, avoiding the database level JOIN operation entirely.
Instead of thinking about needing to join those two values together in the database, think about how to compose them together at the edges:
When you render an HTML page for a customer, you can supply HTML from multiple services and compose them next to each other visually: the customer details come from the customer service, and the order details from the order service.
Likewise an invoice email: compose data supplied from multiple services visually, without needing the in-database join.
This has two advantages: one, you do away with the need to join in the database, and even the need to have the data stored in the same type of database. Now each service can use whatever data store is most appropriate for its needs.
Two, you can more easily change the outside of your application. If you have small, composable parts, you can easily add parts or rearrange them in new ways.
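As a rough illustration of composing at the edge, here is a minimal Python sketch. The class names, the `render_details` and `render_orders` methods, and the sample data are all hypothetical stand-ins; in a real system each fragment would come from a separate service over the network rather than from an in-process object.

```python
import html


class CustomerService:
    """Hypothetical stand-in for the service that owns customer data."""

    def __init__(self) -> None:
        self._customers = {1: {"name": "Ada"}}

    def render_details(self, customer_id: int) -> str:
        # The service renders its own data; nobody else reads its tables.
        name = html.escape(self._customers[customer_id]["name"])
        return f"<section><h2>{name}</h2></section>"


class OrderService:
    """Hypothetical stand-in for the service that owns order data."""

    def __init__(self) -> None:
        self._orders = [
            {"id": 10, "customer_id": 1, "total": 25.0},
            {"id": 11, "customer_id": 1, "total": 40.0},
            {"id": 12, "customer_id": 2, "total": 5.0},
        ]

    def render_orders(self, customer_id: int) -> str:
        items = "".join(
            f"<li>Order {o['id']}: {o['total']:.2f}</li>"
            for o in self._orders
            if o["customer_id"] == customer_id
        )
        return f"<ul>{items}</ul>"


def render_customer_page(customers: CustomerService,
                         orders: OrderService,
                         customer_id: int) -> str:
    # The page only places the two fragments next to each other.
    # The shared customer ID is the only thing both services know about.
    return customers.render_details(customer_id) + orders.render_orders(customer_id)


page = render_customer_page(CustomerService(), OrderService(), 1)
```

Note that the only thing crossing the service boundary here is the customer ID; neither service sees the other's storage.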
The guiding principle is that it is OK to cache immutable data.
This means that simple immutable data from the customer entity can exist in the order service, and there's no need to go to the customer service every time you need the info. Breaking everything into isolated services and then always making these remote procedure calls ignores the fallacies of distributed computing.
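A minimal sketch of that idea in Python, assuming the cached fields really never change (here a customer ID and a signup date, both chosen only for illustration). `CustomerSnapshot`, `OrderService.customer_info` and `fake_customer_service` are hypothetical names, and the fake function stands in for a remote call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CustomerSnapshot:
    customer_id: int
    signup_date: str  # assumed to never change once the customer exists


class OrderService:
    def __init__(self, fetch_customer: Callable[[int], CustomerSnapshot]) -> None:
        self._fetch_customer = fetch_customer
        self._cache: Dict[int, CustomerSnapshot] = {}

    def customer_info(self, customer_id: int) -> CustomerSnapshot:
        # Because the data is immutable, a cached copy can never go stale,
        # so no invalidation logic is needed.
        if customer_id not in self._cache:
            self._cache[customer_id] = self._fetch_customer(customer_id)
        return self._cache[customer_id]


remote_calls: List[int] = []


def fake_customer_service(customer_id: int) -> CustomerSnapshot:
    # Stand-in for a remote procedure call to the customer service.
    remote_calls.append(customer_id)
    return CustomerSnapshot(customer_id, "2011-10-27")


orders = OrderService(fake_customer_service)
orders.customer_info(7)
orders.customer_info(7)  # served from the local cache, no second remote call
```

Mutable fields (a current address, say) would not fit this pattern without some way of keeping the copy up to date.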
If you have extensive reporting needs, you need to create an additional service. I call that an Aggregated Reporting service, which, again, gets read-only data for reporting purposes. You can see an article I wrote about that for InfoQ a few years ago.
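One way such a reporting service might look, sketched in Python with an in-memory SQLite database. This is only an assumption about the shape of the idea, not the design from the article: the `ReportingService` class, its `load_*` methods, and the table layout are all hypothetical. The point is that the JOIN runs inside the reporting service's own read-only copy, not across the stores of the other two services.

```python
import sqlite3
from typing import Iterable, List, Tuple


class ReportingService:
    """Keeps its own read-only copy of data published by other services."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"
        )
        self._db.execute(
            "CREATE TABLE orders ("
            "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
        )

    def load_customers(self, rows: Iterable[Tuple[int, str]]) -> None:
        # In practice the rows would arrive as events or a periodic export.
        self._db.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?)", rows)

    def load_orders(self, rows: Iterable[Tuple[int, int, float]]) -> None:
        self._db.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)

    def revenue_per_customer(self) -> List[Tuple[str, float]]:
        cursor = self._db.execute(
            "SELECT c.name, SUM(o.total) "
            "FROM customers c JOIN orders o ON o.customer_id = c.id "
            "GROUP BY c.id, c.name ORDER BY c.name"
        )
        return cursor.fetchall()


reporting = ReportingService()
reporting.load_customers([(1, "Ada"), (2, "Grace")])
reporting.load_orders([(10, 1, 25.0), (11, 1, 40.0), (12, 2, 5.0)])
report = reporting.revenue_per_customer()
```

The reporting copy is never written back to the owning services, so the business rules stay where they belong.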
In the SO question you quoted, various people state that it is OK for a service to access another service's data, so the Order service could have a GetAllWithCustomer function, which would return all the orders along with the customer details for each order.
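A rough Python sketch of what such a function could look like, assuming the Order service reaches customer data through the Customer service's API rather than its tables. The batch lookup `get_many` is a hypothetical method introduced here so that the merge needs one call instead of one per order.

```python
from typing import Dict, Iterable, List, Optional


class CustomerService:
    def __init__(self, customers: Dict[int, dict]) -> None:
        self._customers = customers

    def get_many(self, ids: Iterable[int]) -> Dict[int, dict]:
        # Hypothetical batch endpoint: one round trip for many customers.
        return {i: self._customers[i] for i in ids if i in self._customers}


class OrderService:
    def __init__(self, orders: List[dict], customer_service: CustomerService) -> None:
        self._orders = orders
        self._customer_service = customer_service

    def get_all_with_customer(self) -> List[dict]:
        ids = {order["customer_id"] for order in self._orders}
        customers = self._customer_service.get_many(ids)
        # In-memory merge keyed on customer_id; unknown customers become None.
        result = []
        for order in self._orders:
            customer: Optional[dict] = customers.get(order["customer_id"])
            result.append({**order, "customer": customer})
        return result


customer_service = CustomerService({1: {"name": "Ada"}, 2: {"name": "Grace"}})
order_service = OrderService(
    [
        {"id": 10, "customer_id": 1, "total": 25.0},
        {"id": 11, "customer_id": 2, "total": 5.0},
        {"id": 12, "customer_id": 3, "total": 9.0},
    ],
    customer_service,
)
orders_with_customers = order_service.get_all_with_customer()
```

With millions of rows this would still need paging or filtering; the sketch only shows where the merge happens.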
Also, this question of mine may be helpful:
https://softwareengineering.stackexchange.com/questions/115958/is-it-bad-practice-for-services-to-share-a-database-in-soa