Design strategy: querying and updating data across 2 different databases

Posted on 2024-09-09 17:24:25

We have a requirement in which we need to query data across 2 different databases (one in SQL Server and the other in Oracle).

Here are the scenarios which need to be implemented:

  1. Query: Get the data from one database and match it against values in the other
  2. Update: Get the data from one database and update the objects in the other

Technology that we are using: ASP.net, C#

The options that we have thought about:

  1. Staging area in one database
  2. Linked server (can't go with this approach, as it is not allowed by an organization-wide policy)
  3. Create web services
  4. Create 2 different DALs and perform list operations in the DAL on the data from the 2 sources

I would like to know: what is the best design strategy to deal with this kind of scenario, and what are the pros and cons of that approach?

Comments (7)

遇见了你 2024-09-16 17:24:25

Is it not possible to use an SSIS package to do the data transformation between the 2 servers, and invoke it either from the ASP.NET & C# project or via a scheduled job invoked on demand?
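
For reference, an SSIS package can be launched from C# through the ManagedDTS runtime. The sketch below assumes a package already exists on disk; the path and package name are placeholders.

```csharp
using System;
using Microsoft.SqlServer.Dts.Runtime;   // reference: Microsoft.SqlServer.ManagedDTS

class SsisRunner
{
    static void Main()
    {
        var app = new Application();

        // Hypothetical .dtsx that moves/transforms data between the Oracle and SQL Server instances.
        Package package = app.LoadPackage(@"C:\etl\OracleToSqlServerSync.dtsx", null);

        DTSExecResult result = package.Execute();
        Console.WriteLine("Package finished with result: " + result);
    }
}
```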

指尖微凉心微凉 2024-09-16 17:24:25

Will the results from one of the databases be small enough to efficiently pass around?

If so, I would suggest treating the databases as two independent datasources.

If the datasets are large, then you may have to consider some form of ETL into a staging area on one of the databases. You may have issues if you need the queries to return up-to-date data from both databases, because you would need to do real-time ETL.
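
A minimal sketch of the "two independent datasources" idea, assuming a small key set comes back from Oracle and is then matched in SQL Server. The table/column names, connection strings, and the managed ODP.NET provider are placeholders for whatever your DALs actually use.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Oracle.ManagedDataAccess.Client;   // assumption: managed ODP.NET; substitute your Oracle provider

public static class CrossDbQuery
{
    // Step 1: pull the small key set out of Oracle.
    public static List<int> GetCustomerIdsFromOracle(string oracleConnString)
    {
        var ids = new List<int>();
        using (var conn = new OracleConnection(oracleConnString))
        using (var cmd = new OracleCommand("SELECT customer_id FROM customers WHERE region = 'EU'", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    ids.Add(Convert.ToInt32(reader[0]));
            }
        }
        return ids;
    }

    // Step 2: match those keys against SQL Server via a parameterised IN list.
    public static List<string> GetMatchingOrdersFromSqlServer(string sqlConnString, List<int> ids)
    {
        var results = new List<string>();
        if (ids.Count == 0) return results;

        var paramNames = ids.Select((id, i) => "@id" + i).ToList();
        var sql = "SELECT OrderNumber FROM Orders WHERE CustomerId IN (" + string.Join(",", paramNames) + ")";

        using (var conn = new SqlConnection(sqlConnString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            for (int i = 0; i < ids.Count; i++)
                cmd.Parameters.AddWithValue(paramNames[i], ids[i]);

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    results.Add(reader.GetString(0));
            }
        }
        return results;
    }
}
```

If the key set grows beyond what a parameterised IN list comfortably handles, that is the point where the staging-area/ETL option becomes attractive.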

冷月断魂刀 2024-09-16 17:24:25

There is an article here about performing distributed transactions between Microsoft SQL Server and Oracle.

I don't know how well this works; however, if it does work, this will probably be the best solution for you:

  • It will almost certainly be the fastest method of querying across multiple database servers.
  • It should also allow for true transactional support even when writing to both databases.
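
If distributed transactions do work in your environment, the usual way to drive them from C# is a TransactionScope that both connections enlist in. This is only a sketch: it assumes MSDTC is configured, that your Oracle provider can enlist in distributed transactions, and the table/column names are invented.

```csharp
using System.Data.SqlClient;
using System.Transactions;
using Oracle.ManagedDataAccess.Client;   // assumption: the Oracle provider must be able to enlist in MSDTC

public static class CrossDbTransaction
{
    // Hypothetical update spanning both servers: both statements commit or roll back together.
    public static void ShipOrder(string sqlServerConnString, string oracleConnString, int orderId)
    {
        using (var scope = new TransactionScope())
        {
            using (var sqlConn = new SqlConnection(sqlServerConnString))
            using (var oraConn = new OracleConnection(oracleConnString))
            {
                sqlConn.Open();
                oraConn.Open();   // opening the second connection escalates the transaction to MSDTC

                using (var sqlCmd = new SqlCommand("UPDATE Orders SET Status = 'Shipped' WHERE OrderId = @id", sqlConn))
                {
                    sqlCmd.Parameters.AddWithValue("@id", orderId);
                    sqlCmd.ExecuteNonQuery();
                }

                using (var oraCmd = new OracleCommand("UPDATE shipments SET status = 'SHIPPED' WHERE order_id = :id", oraConn))
                {
                    oraCmd.Parameters.Add(new OracleParameter("id", orderId));
                    oraCmd.ExecuteNonQuery();
                }
            }
            scope.Complete();   // commit on both servers; without this, everything rolls back on dispose
        }
    }
}
```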

莫相离 2024-09-16 17:24:25

The best strategy for this would be to use a Linked Server, as it is designed for querying and writing to heterogeneous databases as you described above. But obviously, due to the policy constraint you mentioned, this is not an option.

Therefore, to achieve the result you want with the best performance, here is what I suggest:

  • Decide which database contains only the lookup data (the minimal dataset); this is the one you will query to pull that info out
  • Insert the lookup data, using bulk copy, into a temp/dummy table in the main database (the one containing most of the data you want to retrieve and return to the caller); a sketch of this step follows at the end of this answer
  • Use a stored procedure or query to join the temp table with the other tables in your main database and retrieve the desired dataset

The decision whether to write this as a web service or not isn't going to change the data retrieval process. But consideration should be given to reducing the overhead of data transfer time by keeping the process as close to your DB server as possible, either on the same machine or within a LAN/high-speed connection link.

The data update will be quite straightforward. It is just the standard two-phase operation of pulling data out of one database and updating the other.
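
A rough sketch of the bulk-copy step described above, assuming SqlBulkCopy streams the Oracle lookup rows straight into a staging table on the SQL Server side (table and column names are placeholders); the join against the main tables would then happen in a stored procedure or query as suggested.

```csharp
using System.Data.SqlClient;
using Oracle.ManagedDataAccess.Client;   // assumption: substitute your Oracle provider of choice

public static class LookupStaging
{
    public static void StageOracleLookup(string oracleConnString, string sqlConnString)
    {
        using (var oraConn = new OracleConnection(oracleConnString))
        using (var oraCmd = new OracleCommand("SELECT customer_id, customer_name FROM customers", oraConn))
        {
            oraConn.Open();
            using (var reader = oraCmd.ExecuteReader())
            using (var sqlConn = new SqlConnection(sqlConnString))
            {
                sqlConn.Open();

                // Recreate the staging table (could also be a #temp table within one session).
                using (var create = new SqlCommand(
                    "IF OBJECT_ID('dbo.Staging_Customers') IS NOT NULL DROP TABLE dbo.Staging_Customers; " +
                    "CREATE TABLE dbo.Staging_Customers (CustomerId INT, CustomerName NVARCHAR(200));", sqlConn))
                {
                    create.ExecuteNonQuery();
                }

                // Stream the Oracle rows straight into SQL Server.
                using (var bulk = new SqlBulkCopy(sqlConn) { DestinationTableName = "dbo.Staging_Customers" })
                {
                    bulk.WriteToServer(reader);
                }
            }
        }
    }
}
```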

你与清晨阳光 2024-09-16 17:24:25

It's hard to tell what the best solution is. But we have a scenario that's nearly the same.

RealTime:

For real-time data updates, we are using WebServices, since in our case the two different databases belong to distinct projects. So every project offers a WebService which can be used for data retrieval and data updates. This has the advantage that a project does not have to care about database structure changes in the other project, as long as the web service interface does not change.

Static Data:

Static data (e.g. employees) is mirrored for faster access. For that large amount of data, we use flat files for the nightly update.

In the case of static data, I think it's important to explicitly define data owners. For every piece of data, it should be clear which database has the original data and which database only has shadow copies for faster access.

So static data is read-only in the shadow database, or only updatable through designated WebServices.
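
To illustrate the "each project exposes a WebService" idea, here is a minimal contract sketch, assuming WCF; the service name and DTO shape are purely illustrative.

```csharp
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;

// Each project publishes its data behind a contract like this, so the
// consuming project never depends on the other database's schema.
[ServiceContract]
public interface IEmployeeService
{
    [OperationContract]
    List<EmployeeDto> GetEmployees(string department);

    [OperationContract]
    void UpdateEmployee(EmployeeDto employee);
}

[DataContract]
public class EmployeeDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public string Department { get; set; }
}
```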

同尘 2024-09-16 17:24:25

The problem with using multiple data sources in your .NET code is that you run the risk of having your CRUD ops fail ACID tests and having data inconsistencies.

I would be most inclined to pursue @Will A's comment to your question...

Set up replication to a remote server, then link the two remote servers.

一张白纸 2024-09-16 17:24:25

Have multiple DALs and handle it in the application. Thousands of rows is not a big number; you need to worry only if you are into the 100,000s or millions, in which case your application will hang.

Use LINQ to perform data operations on the datasets that are generated, rather than looping through them.
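
A small sketch of that suggestion, assuming the two DALs return plain lists; the Customer/Order shapes are placeholders for whatever your DALs actually produce.

```csharp
using System.Collections.Generic;
using System.Linq;

// Placeholder shapes for whatever the two DALs return.
public class Customer { public int Id; public string Name; }
public class Order { public int CustomerId; public decimal Amount; }

public static class ListMatching
{
    // Join the Oracle-sourced list with the SQL Server-sourced list in memory
    // instead of looping over both with nested foreach blocks.
    public static List<string> MatchCustomersToOrders(
        List<Customer> customersFromOracle, List<Order> ordersFromSqlServer)
    {
        return (from c in customersFromOracle
                join o in ordersFromSqlServer on c.Id equals o.CustomerId
                select c.Name + ": " + o.Amount).ToList();
    }
}
```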
