Repository pattern with "modern" data access strategies

Posted 2024-10-01 16:42:35

So I was searching the web for best practices on implementing the repository pattern with multiple data stores when I found my entire way of looking at the problem turned upside down. Here's what I have...

My application is a BI tool pulling data from (as of now) four different databases. Due to internal constraints, I am currently using LINQ-to-SQL for data access but require a design that will allow me to change to Entity Framework or NHibernate or the next data access du jour. I also hold steadfast to decoupled layers in my apps using an IoC framework (Castle Windsor in this case).

As such, I've used the Repository pattern to abstract the actual data access code from my business layer. As a result, my business object is coded against some I<Entity>Repository interface and the IoC Container is used to manage the actual implementation. In this case, I would expect to have a concrete Linq<Entity>Repository that implements the interface using LINQ-to-SQL to do the work. Later I could replace this with an EF<Entity>Repository with no changes required to my business layer.

Also, because I'm coding against the interface, I can easily mock the repository for unit testing purposes.
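
To make the two paragraphs above concrete, here is a minimal sketch of a business object coded against a repository interface and exercised with a test double. It is written in Python rather than C# only to keep it short and self-contained. `Sale`, `SalesRepository`, `FakeSalesRepository` and `SalesSummary` are hypothetical names for illustration, and plain constructor injection stands in for what the IoC container would do.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class Sale:
    """Hypothetical entity, for illustration only."""
    customer_id: int
    amount: float


class SalesRepository(Protocol):
    """The only thing the business layer knows about data access."""

    def sales_for_customer(self, customer_id: int) -> List[Sale]: ...


class FakeSalesRepository:
    """Test double; a LINQ-to-SQL or EF backed class would offer the same method."""

    def __init__(self, sales: Iterable[Sale]) -> None:
        self._sales = list(sales)

    def sales_for_customer(self, customer_id: int) -> List[Sale]:
        return [s for s in self._sales if s.customer_id == customer_id]


class SalesSummary:
    """Business object: it receives a repository instead of creating one."""

    def __init__(self, sales: SalesRepository) -> None:
        self._sales = sales

    def total_for(self, customer_id: int) -> float:
        return sum(s.amount for s in self._sales.sales_for_customer(customer_id))
```

Swapping the data access technology then means supplying a different class with the same method; `SalesSummary` itself does not change.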

So the first question that I have as I begin coding the application is whether I should have one repository per DataContext or per entity (as I've typically done)? Let's say one database contains Customers and Sales with the expected relationship. Should I have a single OrderTrackingRepository with methods that work with both entities or have a separate CustomerRepository and a different SalesRepository?

Next, as a BI tool, the primary interface is for reporting, charting, etc., and it will often require a "mashup" of data across multiple sources. For instance, one database contains customer information, another handles sales information, and a third holds other financial information, but one of my requirements is to display aggregated information that spans all three. Plus, I have to support dynamic filtering in the UI. Obviously, working directly against the LINQ-to-SQL or EF DataContext objects (Table<Entity>, for instance) will allow me to do pretty much anything. What's the best approach to expose that same functionality to my business logic when abstracting the DAL with a repository interface?

This article (link text) indicates that EF4 has turned this approach around and that the repository is nothing more than an IQueryable returned from the EF DataContext, which brings up a whole other set of questions.

But, I think I've rambled on enough...

UPDATE (Thanks, Steven!)

Okay, let me put a more tangible (for me, at least) example on the table and clarify a few points that will hopefully lead to an approach I can better wrap my head around.

While I understand what Steven has proposed, I have a team of developers I have to consider when implementing such things and I'm afraid they will get lost in the complexity (yes, a real problem here!).

So, let's remove any direct tie-in with Linq-to-Sql because I don't want a solution that is dependent upon the way L2S works, or even EF, for that matter. My intent has been to abstract away the data access technology being used so that I can change it as needed without requiring collateral changes to the consuming code in my business layer. I've accomplished this in the past by presenting the business layer with IRepository interfaces to work against. Perhaps these should have been named IUnitOfWork or, more to my liking, IDataService, but the goal is the same. These interfaces typically exposed methods such as Add, Remove, Contains and GetByKey.
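
For reference, an interface with those four operations might look roughly like the sketch below. It is in Python rather than C#, and `Repository`, `InMemoryRepository` and the `key_of` parameter are hypothetical names chosen for illustration; they are not the interfaces from my earlier projects.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Generic, Hashable, Optional, TypeVar

K = TypeVar("K", bound=Hashable)
T = TypeVar("T")


class Repository(ABC, Generic[K, T]):
    """Technology-neutral contract seen by the business layer."""

    @abstractmethod
    def add(self, entity: T) -> None: ...

    @abstractmethod
    def remove(self, key: K) -> bool: ...

    @abstractmethod
    def contains(self, key: K) -> bool: ...

    @abstractmethod
    def get_by_key(self, key: K) -> Optional[T]: ...


class InMemoryRepository(Repository[K, T]):
    """Dictionary-backed stand-in for an L2S or EF implementation."""

    def __init__(self, key_of: Callable[[T], K]) -> None:
        self._key_of = key_of          # how to read the key from an entity
        self._items: Dict[K, T] = {}

    def add(self, entity: T) -> None:
        self._items[self._key_of(entity)] = entity

    def remove(self, key: K) -> bool:
        if key in self._items:
            del self._items[key]
            return True
        return False

    def contains(self, key: K) -> bool:
        return key in self._items

    def get_by_key(self, key: K) -> Optional[T]:
        return self._items.get(key)
```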

Here's my situation. I have three databases to work with. One is DB2 and contains all of the business information for a customer (franchise), such as their info and their Products, Orders, etc. Another, a SQL Server database, contains their financial history, while a third SQL Server database contains application-specific information. The first two databases are shared by multiple applications.

Through my application, the customer may enter/upload their financial information for a given time period. When entered, I have to perform the following steps:

1. Validate the entered data against a set of static rules. For example, the data must contain a legitimate customer ID value (in the case of an upload). This requires a lookup in the DB2 database to verify that the supplied customer ID exists and is current.
2. Next I have to validate the data against a set of dynamic rules which are contained in the third (SQL Server) database. An example may be that a given value cannot exceed a certain percentage of another value.
3. Once validated, I persist the data to the second SQL Server database containing the financial data.
All the while, my code must have loosely-coupled dependencies so I may mock them in my unit tests.
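
The three steps above can be sketched as follows, with each data store reduced to an injected callable so that it can be mocked. This is a Python illustration under assumptions, not my actual code: `FinancialEntry`, `FinancialSubmission`, the expense-to-revenue rule and all parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class FinancialEntry:
    """Hypothetical shape of one uploaded record."""
    customer_id: int
    revenue: float
    expenses: float


class FinancialSubmission:
    def __init__(
        self,
        customer_is_current: Callable[[int], bool],  # stands in for the DB2 lookup
        max_expense_ratio: Callable[[], float],      # stands in for the rules database
        save: Callable[[FinancialEntry], None],      # stands in for the financial database
    ) -> None:
        self._customer_is_current = customer_is_current
        self._max_expense_ratio = max_expense_ratio
        self._save = save

    def submit(self, entry: FinancialEntry) -> List[str]:
        errors: List[str] = []
        # Step 1: static rule, the customer ID must exist and be current.
        if not self._customer_is_current(entry.customer_id):
            errors.append("unknown or inactive customer")
        # Step 2: dynamic rule, one value may not exceed a percentage of another.
        if entry.expenses > self._max_expense_ratio() * entry.revenue:
            errors.append("expenses exceed allowed share of revenue")
        # Step 3: persist only when every rule passed.
        if not errors:
            self._save(entry)
        return errors
```

In a unit test, the three callables can be lambdas or a list's `append`, so no database is needed to exercise the workflow.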

As part of the analysis, I know that I have three distinct data stores to work with and about a half-dozen or so entities (at this time) that I am working with. In generic terms, I presume that I would have three DataContexts in my application, one per data store, with the entities exposed by the appropriate data context.

I could then create a separate I{repository|unit of work|service} for each entity that would be consumed by my business logic, with a concrete implementation that knows which data context to use. But this seems to be a risky proposition because, as the number of entities increases, so does the number of individual repository|UoW|service types.

Then, take the case of my validation logic which works with multiple entities and, thereby, multiple data contexts. I'm not sure this is the most efficient way to do this.

The other requirement that I have yet to mention is on the reporting side where I will need to execute some complex queries on the data stores. As of right now, these queries will be limited to a single data store at a time, but the possibility is there that I might need to have the ability to mash data together from multiple sources.

Finally, I am considering the idea of pulling out all of the data access stuff for the first two (shared) databases into their own project and have been looking at WCF Data Services as a possible approach. This would give me the basis for a consistent approach for any application making use of this data.

How does this change your thinking?

秋叶绚丽 2024-10-08 16:42:35

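For your situation, I would recommend returning IEnumerables from your repositories for data queries. I usually aggregate calls to multiple repositories through service classes that represent a domain concern and encapsulate the business logic. To keep things clean, I try to keep my repos focused on a domain concern. I liken my DataContext to a repository and use a T4 template to extract an interface so that it is easier to mock. But there is nothing stopping you from using a traditional repository that encapsulates your calls. Doing so will allow you to switch ORMs at any stage.

EDIT: IQueryable is not the answer! :-)
I have done a lot of work in this area too and initially came to the same conclusion, but it is not a good solution. The point of a Repo is to abstract queries into discrete chunks of work. Exposing IQueryable is too ad hoc and causes problems later on. You lose the ability to scale. You lose the ability to optimize queries (say I want to move to a highly optimized stored procedure). You lose the ability to use IoC to swap the data access layer behind the repository (switching a project from SQL to Mongo). You lose the ability to provide effective data caching in the Repo (which is a major strength of the Repo pattern). I suggest taking a closer look at why we have the Repo pattern. It is not just an "ORM" mapping layer. What really made this clear to me was the CQRS pattern.

Beyond that, the ad hoc nature of IQueryable encourages inappropriate reuse of queries. Reusing queries is often not a good idea, because slight deviations appear between the use cases, and you end up with two by-products: the query becomes too broad and inefficient, and it fills up with unmaintainable IF THEN statements to cater for the deviations.

IQueryable is easy, but it leaves you with a mess that is hard to maintain.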
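
One way to read the caching point: when the repository exposes a named, discrete query, a caching wrapper can sit in front of any implementation without the callers noticing. Below is a minimal Python sketch of that idea; `RuleRepository`, `CachingRuleRepository` and `rules_for` are hypothetical names, and cache invalidation is deliberately left out.

```python
from typing import Dict, List, Protocol, Tuple


class RuleRepository(Protocol):
    """A discrete query, as opposed to an open-ended queryable."""

    def rules_for(self, category: str) -> List[str]: ...


class CachingRuleRepository:
    """Wraps any RuleRepository and remembers each answer by category."""

    def __init__(self, inner: RuleRepository) -> None:
        self._inner = inner
        self._cache: Dict[str, Tuple[str, ...]] = {}

    def rules_for(self, category: str) -> List[str]:
        if category not in self._cache:
            # First request for this category: hit the real data store once.
            self._cache[category] = tuple(self._inner.rules_for(category))
        # Hand out a fresh list so callers cannot mutate the cached copy.
        return list(self._cache[category])
```

The same wrapper would be hard to write around an exposed IQueryable, because the set of possible queries is open-ended.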

无风消散 2024-10-08 16:42:35

Look at this SO answer. I think it shows a simplified model of what you want. IQueryable<T> is indeed our new Repository :-). DataContext and ObjectContext are our Unit of Work.

UPDATE 2:

Here is a blog post that describes the model you might be looking for.

UPDATE 3

It would be wise to hide the shared databases behind a service. This will solve several problems:

This will make the database private to the service, which makes it much easier to change the implementation when needed.
You can put the needed validation logic (for database 1) in that service and can create tests for that validation logic in that project.
Clients accessing that service can assume correctness of the service, and its validation logic.

The result of this is that your application will send data to the service to validate it, call the service to fetch data, query its own private database (database 3), and join the data from the three data sources together locally. I've never been a fan of using cross-database or even cross-server (in your situation) database calls and letting the database join everything together. Transactions will be promoted to distributed transactions, and it's hard to predict how much data the servers will exchange.
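
As a rough illustration of joining locally, the Python sketch below combines three already-fetched result sets in memory. The dictionaries stand in for the two service calls and the private database query; `ReportRow`, `join_locally` and the per-customer limit are hypothetical and only show the shape of the join.

```python
from typing import Dict, List, NamedTuple


class ReportRow(NamedTuple):
    customer_id: int
    name: str
    total: float
    over_limit: bool


def join_locally(
    names: Dict[int, str],     # assumed result of the customer service call
    totals: Dict[int, float],  # assumed result of the financial service call
    limits: Dict[int, float],  # assumed result of the app's own database query
) -> List[ReportRow]:
    """Left-join totals and limits onto the customer list, keyed by customer ID."""
    rows: List[ReportRow] = []
    for customer_id in sorted(names):
        total = totals.get(customer_id, 0.0)
        limit = limits.get(customer_id)
        over = limit is not None and total > limit
        rows.append(ReportRow(customer_id, names[customer_id], total, over))
    return rows
```

No transaction spans the three sources here, so nothing is promoted to a distributed transaction; the cost is that the application has to fetch each result set before joining.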

When you abstract the shared databases behind the service, things get easier (at least from your application's point of view). Your application calls services it trusts, which limits the amount of code in that application and the number of tests. You still want to mock the calls to such a service, but that would be pretty easy. It should also solve the problem of validating over multiple data sources.

Validation is always a hard part. I'm very familiar with Validation Application Block, and love it for its flexibility. It isn't an easy framework, however, but you might take a peek at what you can do with it. For instance, I've written several articles about integration with O/RM tools and how to 'embed' a context (context as in DataContext/Unit of Work) in Validation Application Block.
