What is the best way to integrate multiple systems?

Posted on 2024-07-06 07:12:09

OK, where I work we maintain a fairly substantial number of systems written over the last couple of decades.

The systems are diverse: they span multiple operating systems (Linux, Solaris, Windows), multiple databases (several versions of Oracle, Sybase, and MySQL), and even multiple languages (C, C++, JSP, PHP, and a host of others).

Each system is fairly autonomous, even at the cost of entering the same data into multiple systems.

Management recently decided that we should investigate what it will take to get all the systems happily talking to each other and sharing data.

Keep in mind that while we can make software changes to any of the individual systems, a complete rewrite of any one system (or more) is not something management is likely to entertain.

The first thought of several of the developers here was the straightforward one: if System A needs data from System B, it should just connect to System B's database and get it. Likewise, if it needs to give B data, it should just insert it into B's database.

Due to the mess of databases (and versions) used, other developers were of the opinion that we should have one new database, combining the tables from all the other systems to avoid having to juggle multiple connections. By doing this they hope that we might be able to consolidate some tables and get rid of the redundant data entry.

This is about the time I was brought in for my opinion on the whole mess.

The whole idea of using the database as a means of system communication smells funny to me. Business logic will have to be duplicated across systems (if System A wants to add data to System B, it had better understand B's rules concerning the data before doing the insert). Several systems will most likely have to do some form of database polling to find any changes to their data. And continuing maintenance will be a headache, as any change to a database schema now propagates to several systems.

My first thought was to take the time and write APIs/Services for the different systems, which once written could be easily used to pass/retrieve data back and forth. A lot of the other developers feel that is excessive and far more work than just using the database.

So what would be the best way to go about getting these systems to talk to each other?

Comments (6)

GRAY°灰色天空 2024-07-13 07:12:09

Integrating disparate systems is my day job.

If I were you, I would go to great effort to avoid accessing System A's data from directly within System B. Updating System A's database from System B is extremely unwise. It is exactly the opposite of good practice to make your business logic so diffuse. You will end up regretting it.

The idea of the central database isn't necessarily bad ... but the amount of effort involved is probably within an order of magnitude of rewriting the systems from scratch. It is certainly not something I would attempt, at least in the form you describe. It can succeed, but it is much, much harder and it takes a lot more discipline than the point-to-point integration approach. It's funny to hear it suggested in the same breath as the 'cowboy' approach of just shoving data directly into other systems.

Overall, your instincts seem pretty good. There are a couple of approaches. You mention one: implementing services. That's not a bad way to go, especially if you need updates in real time. The other is a separate integration application that is responsible for shuffling the data around. That's the approach I usually take, but usually because I can't change the systems I'm integrating so that they ask for the data they need; I have to push the data in. In your case the services approach isn't a bad one.
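As a rough illustration of the services approach, here is a minimal sketch in which System B's rules live in one place and other systems call the service instead of inserting rows themselves. The `CustomerService` class, the `customer` table, and the email rule are all hypothetical, and an in-memory SQLite database stands in for B's real database.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Hypothetical service owned by System B; the only code that writes B's tables."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, email TEXT UNIQUE)"
        )

    def add_customer(self, email: str) -> int:
        # B's business rule is enforced here, once, instead of in every caller.
        normalized = email.strip().lower()
        if "@" not in normalized:
            raise ValueError(f"invalid email: {email!r}")
        cur = self._conn.execute(
            "INSERT INTO customer (email) VALUES (?)", (normalized,)
        )
        self._conn.commit()
        return cur.lastrowid

    def get_email(self, customer_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT email FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None
```

System A would call `add_customer` (over whatever transport is chosen) and never needs to know B's schema, so a schema change in B only touches this one class.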

One thing I would like to say that might not be obvious to someone coming to system integration for the first time is that every piece of data in your system should have a single, authoritative point of truth. If the data is duplicated (and it is duplicated), and the copies disagree with each other, the copy in the point of truth for that data must be taken to be correct. There is just no other way to integrate systems without having the complexity scream skyward at an exponential rate. Spaghetti integration is like spaghetti code, and it should be avoided at all costs.
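To make the point-of-truth rule concrete, here is a small sketch. The system names (`crm`, `billing`, `support`) and the `POINT_OF_TRUTH` mapping are assumptions for illustration; a real mapping would come out of an analysis of which system owns which data.

```python
# Hypothetical mapping: which system is authoritative for each field.
POINT_OF_TRUTH = {"email": "crm", "balance": "billing"}


def resolve(field: str, copies: dict) -> object:
    """Return the authoritative value for `field`, given each system's copy."""
    owner = POINT_OF_TRUTH[field]
    if owner not in copies:
        raise LookupError(f"no copy from the point of truth ({owner}) for {field}")
    return copies[owner]


def find_conflicts(field: str, copies: dict) -> list:
    """List the systems whose copy disagrees with the point of truth."""
    truth = resolve(field, copies)
    return sorted(system for system, value in copies.items() if value != truth)
```

Given `{"crm": "a@x.org", "billing": "old@x.org"}` for `email`, `resolve` returns the CRM value and `find_conflicts` reports `billing` as the copy that needs correcting.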

Good luck.

EDIT:

Middleware addresses the problem of transport, but that is not the central problem in integration. If the systems are close enough together that one app can shove data directly into another, they're probably close enough that a service offered by one can be called directly by another. I wouldn't recommend middleware in your case. You might get some benefit from it, but that would be outweighed by the increased complexity. You need to solve one problem at a time.

盛装女皇 2024-07-13 07:12:09

Sounds like you may want to investigate Message Queuing and message-oriented middleware.

MSMQ and Java Message Service are examples.
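For readers new to the idea, the sketch below shows the shape of message-based integration. It uses Python's standard `queue.Queue` as an in-process stand-in for a real broker; it is not the MSMQ or JMS API, and the message format and function names are made up for illustration.

```python
import json
import queue
from typing import Callable


def publish_customer_changed(broker: queue.Queue, customer_id: int, email: str) -> None:
    """System A announces a change without knowing who will consume it."""
    message = {"type": "customer_changed", "id": customer_id, "email": email}
    broker.put(json.dumps(message))


def drain(broker: queue.Queue, handler: Callable[[dict], None]) -> int:
    """System B handles every pending message and returns how many it processed."""
    handled = 0
    while True:
        try:
            raw = broker.get_nowait()
        except queue.Empty:
            return handled
        handler(json.loads(raw))
        handled += 1
```

The sender and receiver share only the message format, not a schema or a database connection, which is the decoupling that message-oriented middleware is meant to provide.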

蓝天 2024-07-13 07:12:09

It seems you are looking for opinions, so I will provide mine.

I agree with the other developers that writing an API for all the different systems is excessive. You would likely get it done faster and have much more control over it if you just take the other suggestion of creating a single database.

北城孤痞 2024-07-13 07:12:09

One of the challenges you will have is aligning the data in each of the different systems so that it can be integrated in the first place. It may be that each of the systems you want to integrate holds an entirely different set of data, but more likely the data overlaps. Before diving into writing APIs (which is the route I would take as well, given your description), I would recommend that you try to come up with a logical data model for the data that needs to be integrated. This data model will then help you leverage the data you have in the different systems and make it more useful to the other systems.
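A logical data model can start as small as one shared record type plus an adapter per system. In this sketch the `Customer` fields and both source layouts (`CUST_NO`/`FNAME`/`LNAME`/`EMAIL_ADDR` for a billing system, `id`/`name`/`email` for a CRM) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical canonical customer record shared by all integrations."""
    customer_id: str
    full_name: str
    email: str


def from_billing(row: dict) -> Customer:
    # Assumed billing layout: CUST_NO, FNAME, LNAME, EMAIL_ADDR
    return Customer(
        customer_id=str(row["CUST_NO"]),
        full_name=f'{row["FNAME"]} {row["LNAME"]}'.strip(),
        email=row["EMAIL_ADDR"].strip().lower(),
    )


def from_crm(row: dict) -> Customer:
    # Assumed CRM layout: id, name, email
    return Customer(
        customer_id=str(row["id"]),
        full_name=row["name"].strip(),
        email=row["email"].strip().lower(),
    )
```

Once both systems map into the same type, overlapping records can be compared directly, which makes the overlaps and disagreements between systems visible early.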

I would also highly recommend an iterative approach to the integration. With legacy systems there is so much uncertainty that trying to design and implement it all in one go is too risky. Start small and work your way to a reasonably integrated system. "Fully integrated" is hardly ever worth aiming for.

旧夏天 2024-07-13 07:12:09

Interfacing directly by pushing data into, or poking at, each other's databases exposes a lot of one system's internal detail to another. There are obvious disadvantages: upgrading one system can break the other. Moreover, there can be technical limitations in how one system can access the database of the other (consider how an application written in C on Unix will interact with a SQL Server 2005 database running on Windows 2003 Server).

The first thing you have to decide is the platform where the "master database" will reside, and the same for the middleware providing the much-needed glue. Instead of going towards API-level middleware integration (such as CORBA), I would suggest you consider message-oriented middleware. MS BizTalk, Sun's eGate, and Oracle's Fusion are some of the options.

Your idea of a new database is a step in the right direction. You might like to read a little about the Enterprise Entity Aggregation pattern.

A combination of "data integration" with middleware is the way to go.

喜爱纠缠 2024-07-13 07:12:09

If you are going towards Middleware + Single Central Database strategy, you might want to consider achieving this in multiple phases. Here's a logical stepped process which can be considered:

  1. Implementation of services/APIs for different systems which expose the functionality for each system
  2. Implementation of Middleware which accesses these APIs and provides an interface to all the systems to access the data/services from other systems (accesses data from central source if available, else gets it from another system)
  3. Implementation of Central Database only, without data
  4. Implementation of caching/data-storage services at the middleware level, which store data in the central database whenever that data is accessed from any of the systems. For example, if System A's records 1-5 are fetched by System B through the middleware, the caching service can store these records in the central database, and the next request for them is served from the central database
  5. Data Cleansing can happen in Parallel
  6. You can also create an import mechanism to push data from multiple systems to the central database on a daily basis (automated or manual)

This way, the effort is distributed across multiple milestones, and data is gradually stored in the central database on a first-accessed, first-stored basis.
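The caching idea in step 4 can be sketched as a read-through store. Everything here is hypothetical: `ReadThroughStore` is an invented name, a plain dictionary stands in for the central database, and `fetch_from_source` stands in for a call to the owning system's API. The sketch also ignores invalidation, so a record changed in the source system after it was cached would be served stale.

```python
from typing import Callable, Dict, Tuple


class ReadThroughStore:
    """Hypothetical middleware cache: serve from the central store, else fetch and keep."""

    def __init__(self, fetch_from_source: Callable[[str, int], dict]) -> None:
        self._fetch = fetch_from_source
        # Stand-in for the central database, keyed by (system, record id).
        self._central: Dict[Tuple[str, int], dict] = {}

    def get(self, system: str, record_id: int) -> dict:
        key = (system, record_id)
        if key not in self._central:
            # First access: go to the owning system and store the result centrally.
            self._central[key] = self._fetch(system, record_id)
        return self._central[key]
```

The first `get("A", 1)` calls System A; later calls for the same record are answered from the central store, which is the "first-accessed, first-stored" behaviour described above.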
