同步两个异构数据库

发布于 2024-08-14 18:01:12 字数 111 浏览 7 评论 0原文

我有 2 个异构数据库。 一种在 mysql 中,一种在 ms sql 中。

我想让他们保持同步。

数据会定期流动,信息流动将是双向的

有人有任何策略/方法吗?

I have 2 heterogeneous databases.
One in mysql and one in ms sql.

I want to keep them in sync.

There will be flow of data periodically and information flow will be both ways

Anyone got any strategies / approaches to it?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

画▽骨i 2024-08-21 18:01:12

假设您不打算使用某种形式的现成解决方案,那么您有几个选择。基本上,您想要做的是找到一种方法来捕获一个数据库中所做的更改并将它们复制到另一个数据库中。

完整提取和 Delta
对要同步的表中的每一行进行按键排序的完整转储,并将其与上次运行同步的转储进行逐行比较。对输出进行排序可以使比较过程更快,因为您可以确定行是否已更改、删除或删除,而无需

  • 优点:保证捕获所有更改。
  • 优点:易于实施。
  • 缺点:慢。
  • 缺点:运行时会对数据库产生大量负载。

对于小型或中型数据库,此选项应该非常可行。

交易日志
分析数据库中的事务日志以找出更改的内容,并将这些更改应用到其他数据库。

  • 优点:更快,因为它不需要从数据库中读取所有内容
  • 优点:实施起来仍然相对简单
  • 缺点:DBA 有时会清除日志以解决生产问题。这可能会导致错过未同步的更改。

如果您可以依赖可用的日志,这可能是个好主意。

触发器
使用触发器记录更改,并将其复制到其他数据库。

  • 优点:快速,因为它只捕获更改。
  • 缺点:增加每笔交易的开销

应用程序中的同步
只需确保应用程序写入两个数据库即可。

  • 优点:数据库没有实际开销。
  • 缺点:不可靠。所需要的只是一个人忘记写入两个数据库。

如果应用程序仅通过几个受控模块写入数据库(即没有太多地方会忘记更新两个数据库),则可以使此功能起作用。在管理较少的设置中(即多个应用程序/不受控制或结构不良的数据库访问/临时脚本),这根本不是一个选项。

Assuming you aren't going to use some form of ready made solution, you've got a few options open to you. Basically what you're trying to do is find a way to capture the changes made in one database and replicate them in the other database.

Full Extract and Delta
Take a complete, sorted by key, dump of every row in the table(s) you want to sync and compare it row-by-row against the dump from the last sync you ran. Having the output sorted makes the compare process a lot quicker, as you can figure out if a row has been changed, removed or deleted without

  • Pro: Guaranteed to capture all changes.
  • Pro: Simple to Implement.
  • Con: Slow.
  • Con: Will create a lot of load on the database when running.

This option should be quite viable for smaller or medium sized databases.

Transaction Logs
Analyze the transaction logs from the database in order to find out what changed, and apply those changes to the other database.

  • Pro: Faster, as it doesn't need to read everything out of the database
  • Pro: Still relatively simple to implement
  • Con: DBAs sometimes have to purge logs to resolve production issues. This can result in missed changes that don't get synced.

Possibly a good idea if you can count on the logs being available.

Triggers
Use triggers to record the changes, and replicate them to the other database.

  • Pro: Fast, as it only captures the changes.
  • Con: Adds overhead to every transaction

Synchronization in the Application
Simply make sure the application writes to both databases.

  • Pro: No real overhead for the database.
  • Con: Unreliable. All it takes is one person forgetting to write to both databases.

This could be made to work if the application writes to the database only through a few controlled modules (ie. there's not many places to forget to update both databases). In a less managed setup (ie. multiple applications / uncontrolled or poorly factored database access / ad-hoc scripts) this simply isn't an option.

梦里°也失望 2024-08-21 18:01:12

阿南德,你可以在谷歌中找到这个。

同步 mysql 和 mssql

我没用过这个软件,但是他们提供免费试用

Anand, you can find this in Google.

sync mysql and mssql

I've not used this software, but they offer a free trial

莫相离 2024-08-21 18:01:12

首先需要更多信息:

  • 您可以承受多少同步延迟? 5秒? 5小时? 15小时?等等
  • 数据库之间有多少转换?视图中没有可以处理的任何内容或琐碎的更改,还是更实质性的内容?
  • 数据量是什么样的?表有多大,每天有多少数据变化,数据是否分区?
  • 并发要求是什么?您可以关闭用户访问几分钟吗?
  • 双向更改会发生在同一个表还是不同的表上?
  • 数据质量要求是什么?如果两个不同的事务(每个事务都针对同一个表中的同一行但在不同的数据库中)大约在同一时间更新,并且一个事务在另一个事务上进行更新,这样可以吗?
  • 粒度是多少?您是否需要复制发生的每个事务或仅复制某个时间点的当前快照(即使该时间点每 5 分钟一次)?

一般来说,如果您需要实时同步,那么您最终会得到复制解决方案。这通常可以处理非常少量的转换(通常通过存储过程发生)。它通常是嗅探日志的商业解决方案。由于大多数人不希望代码依赖于日志格式,因此他们几乎总是使用打包的解决方案。

如果您不需要实时同步、拥有大量数据或有重大转换要求,那么您最终会选择 ETL 解决方案。有很多可供选择,但它们大多是商业的。另一方面,如果您花时间了解最佳实践,那么您自己开发它们并不困难。奇怪的是,实际上并没有太多谈论。不管怎样,Adam Luchjenbroers 很好地识别了大多数 ETL 方法。如果您能够承受快照之间的松散事务,我建议您使用文件增量方法 - 因为它是最准确的方法,因为所有其他方法都依赖于时间戳、触发器或日志,而这些方法无法捕获所有更改。

First more info is needed:

  • how much latency in the syncrhonization can you afford? 5 seconds? 5 hours? 15 hours? etc.
  • how much transformation between databases? Nothing or trivial changes that can be handled in a view, or something more substantial?
  • what do the data volumes look like? How large are the tables, how much data changes daily, and is the data partitioned?
  • what are the concurrency requirements? Can you shut off user access for a few minutes?
  • will bi-directional changes be occurring to the same table or to different tables?
  • what are the data quality requirements? is it ok if two different transactions, each to the same row in the same table but in different databases gets updated at about the same time - and one steps on the other?
  • what is the granularity? do you need to replicate each transaction that occurs or just the current snapshot at a point in time (even if that point in time is every 5 minutes)?

In general, if you need real-time synchronization then you end up with a replication solution. This can typically handle a very small amount of transformation (usually happens via stored procs). It is typically a commercial solution that sniffs logs. Since most people don't want to have a code dependency on log formats they almost always go with a packaged solution.

If you don't need real-time synchronization, have vast data volumes or have significant transformation requirements then you end up with an ETL solution. There are quite a few to choose from, but they are mostly commercial. On the other hand, they aren't difficult to develop yourself - if you take the time to understand best practices. Which oddly enough, really aren't talked about much. Anyhow, Adam Luchjenbroers did a good job identifying most of the approaches with ETL. I recommend the file delta approach if you can afford to loose transactions between snapshots - since it is otherwise the most accurate approach since all of the others rely on timestamps, triggers or logs which do not capture all changes.

暖心男生 2024-08-21 18:01:12

看看synchrodb.com

SynchroDB会比较两个数据库的内容,并根据参数和源数据库的内容更新目标数据库的内容。

SynchroDB 使用功能键的概念来匹配源数据库和目标数据库中的行。

SynchroDB 是高度可定制的。默认行为是:

• 在目标数据库中插入它不包含的行。

•对于属于两个数据库的行,将其字段一一比较,并在目标数据库中更新不同的字段。

•从目标数据库中删除源数据库中不存在的行。

SynchroDB 是免费的。

Have a look to synchrodb.com

SynchroDB compares the contents of two databases and updates the contents of the destination database according to the parameters and the contents of the source database.

SynchroDB uses the concept of Functional Keys to match rows in the source and destination databases.

SynchroDB is highly customizable. The default behavior is :

•To insert in the destination database the rows it doesn't contain.

•For the rows that belong to both database, to compare their fields one by one and to update in the destination databases the fields that are different.

•To delete from the destination database the rows that are not present in the source database.

SynchroDB is free.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文