Can I set up filtered, star-topology database replication?

Posted on 2024-08-11 03:06:53

We have a client that needs to set up N local databases, each one containing one site's data, and then have a master corporate database containing the union of all N databases. Changes in an individual site database need to be propagated to the master database, and changes in the master database need to be propagated to the appropriate individual site database.

We've been using MySQL replication for a client that needs two databases that are kept simultaneously up to date. That's a bidirectional replication. If we tried exactly the same approach here we would wind up with all N local databases equivalent to the master database, and that's not what we want. Not only should each individual site not be able to see data from the other sites, sending that data N times from the master instead of just once is probably a huge waste.

What are my options for accomplishing this new star pattern with MySQL? I know we can replicate only certain tables, but is there a way to filter the replication by records?

Are there any tools that would help or competing RDBMSes that would be better to look at?

Comments (8)

赠佳期 2024-08-18 03:06:53

SymmetricDS would work for this. It is web-enabled, database independent, data synchronization/replication software. It uses web and database technologies to replicate tables between relational databases in near real time. The software was designed to scale for a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.

We have used it to synchronize 1000+ MySQL retail store databases to an Oracle corporate database.

ま昔日黯然 2024-08-18 03:06:53

I've done this before, and AFAIK this is the easiest way. You should look into using Microsoft SQL Server Merge Replication with row filtering. The row filter would be based on a column that states which individual site each row should go to.

For example, your tables might look like this:

ID_column | column2 | destination

The data in the column might look like this:
12345 | 'data' | 'site1'

You would then set your merge replication "subscriber" site1 to filter on column 'destination' and value 'site1'.
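
To make the row-filter idea concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is not SQL Server merge replication and does not use any replication API. It only shows the effect of filtering on the 'destination' column, so that a subscriber such as site1 receives only its own rows. The table name `site_data` and the helper names are assumptions made up for this illustration.

```python
import sqlite3


def build_demo_master():
    """Create a stand-in 'master' table with a destination column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE site_data ("
        "id_column INTEGER PRIMARY KEY, column2 TEXT, destination TEXT)"
    )
    conn.executemany(
        "INSERT INTO site_data VALUES (?, ?, ?)",
        [
            (12345, "data", "site1"),
            (12346, "other data", "site2"),
            (12347, "more data", "site1"),
        ],
    )
    return conn


def rows_for_site(conn, site):
    """Return only the rows whose destination column names the given site."""
    cur = conn.execute(
        "SELECT id_column, column2, destination FROM site_data "
        "WHERE destination = ? ORDER BY id_column",
        (site,),
    )
    return cur.fetchall()


master = build_demo_master()
site1_rows = rows_for_site(master, "site1")
print(site1_rows)
```

In real merge replication the equivalent predicate would be declared once on the publication, and the server would apply it for each subscriber.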

This article will probably help:

Filtering Published Data for Merge Replication
There is also an article on MSDN called "Enhancing Merge Replication Performance" that may help. You will also need to learn the basics of setting up publishers and subscribers in SQL Server merge replication.

Good luck!

乞讨 2024-08-18 03:06:53

Might be worth a look at mysql-table-sync from maatkit which lets you sync tables with an optional --where clause.

方觉久 2024-08-18 03:06:53

If you need unidirectional replication, use multiple copies of the databases replicated at the center of the star, plus a custom "bridge" application to move the data on to the final database.

执妄 2024-08-18 03:06:53

Just a random pointer: Oracle Lite supports this. I evaluated it once for a similar task, but it needs something installed on all clients, which was not an option.
A rough architecture overview can be found here

朦胧时间 2024-08-18 03:06:53

Short answer: no, you should redesign.

Long answer: yes, but it's pretty crazy and will be a real pain to set up and manage.

One way would be to round-robin the main database's replication among the sites. Use a script to replicate from one site for, say, 30 seconds, record how far it got, and then go on to the next site. You may wish to look at replicate-do-db and friends to limit what is replicated.
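
The rotation itself can be sketched as a small simulation. This is only an illustration under stated assumptions: each site's change log is modeled as a plain Python list rather than a MySQL binary log, the 30-second time slice is replaced by a fixed `batch_size` per visit, and the function name `round_robin_replicate` is made up here. No actual MySQL replication commands are involved.

```python
from itertools import cycle


def round_robin_replicate(site_logs, positions, batch_size, turns):
    """Visit sites in rotation, applying up to batch_size pending changes per visit.

    site_logs: dict of site name -> list of changes (stand-in for a replication log)
    positions: dict of site name -> index of the next unread change (updated in place)
    Returns the (site, change) pairs that were applied, in order.
    """
    applied = []
    sites = cycle(sorted(site_logs))
    for _ in range(turns):
        site = next(sites)
        start = positions.get(site, 0)
        batch = site_logs[site][start:start + batch_size]
        applied.extend((site, change) for change in batch)
        # Record how far we got so the next visit resumes from here.
        positions[site] = start + len(batch)
    return applied


logs = {"site1": ["a1", "a2", "a3"], "site2": ["b1"]}
positions = {}
applied = round_robin_replicate(logs, positions, batch_size=2, turns=4)
print(applied)
print(positions)
```

The saved position per site is the important part: without it, each visit would either replay old changes or skip new ones.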

Another option, which I'm not sure would work, is to have N MySQL instances in the main office, each replicating from one of the site offices, and then use the federated storage engine to provide a common view from the main database into the per-site slaves. The site slaves can replicate from the main database and pick up whichever changes they need.

雪花飘飘的天空 2024-08-18 03:06:53

Sounds like you need some specialist assistance - and I'm probably not it.

How 'real-time' does this replication need to be?
Some sort of ETL process (or processes) is possibly an option. We use MS SSIS and Oracle in-house; SSIS seems to be fairly good for ETL-type work (but I don't work at that specific coal face, so I can't really say).

How volatile is the data? Would you say the data is mostly operational / transactional?
What sort of data volumes are you talking about?

Is the central master also used as a local DB for the office where it is located? If it is, you might want to change that and have head office work just like a remote office. That way you can treat all offices the same; you'll often run into problems and anomalies if different sites are treated differently.

葬﹪忆之殇 2024-08-18 03:06:53

It sounds like you would be better served by stepping outside of a direct database structure for this.

I don't have a detailed answer for you, but this is the high level of what I would do:

I would select from each database a list of changes during the past (reasonable time frame), construct the insert and delete statements that would unify all of the data on the 'big' database, and then separate smaller sets of insert and delete statements for each of the specific databases.

I would then run these.

There is a potential for 'merge' issues with this setup if there is any overlap with data coming in and out.

There is also the issue of data being lost or duplicated if your time frames are not constructed properly.
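
A rough sketch of that batch approach, using Python and in-memory SQLite databases as stand-ins for the site and master databases. Everything here is an assumption for illustration: the `records` table, the integer `changed_at` column used as a watermark, and the helper names. It assumes row ids are unique across all sites, and it handles inserts and updates only; deletes would need something extra, such as tombstone rows.

```python
import sqlite3

SCHEMA = (
    "CREATE TABLE records ("
    "id INTEGER PRIMARY KEY, site TEXT, payload TEXT, changed_at INTEGER)"
)


def make_db(rows=()):
    """Create an in-memory database with the hypothetical records table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?)", rows)
    return conn


def pull_changes(source, since):
    """Return rows changed strictly after the watermark, oldest first."""
    return source.execute(
        "SELECT id, site, payload, changed_at FROM records "
        "WHERE changed_at > ? ORDER BY changed_at, id",
        (since,),
    ).fetchall()


def sync_to_master(site_db, master_db, watermark):
    """Copy changes newer than the watermark into master; return the new watermark."""
    rows = pull_changes(site_db, watermark)
    master_db.executemany("INSERT OR REPLACE INTO records VALUES (?, ?, ?, ?)", rows)
    master_db.commit()
    return max((row[3] for row in rows), default=watermark)


site1 = make_db([(1, "site1", "a", 10), (2, "site1", "b", 20)])
master = make_db()

watermark = sync_to_master(site1, master, 0)
site1.execute("UPDATE records SET payload = 'b2', changed_at = 30 WHERE id = 2")
watermark = sync_to_master(site1, master, watermark)

master_rows = master.execute("SELECT id, payload FROM records ORDER BY id").fetchall()
print(watermark, master_rows)
```

The strict `changed_at > ?` comparison is exactly where the time-frame problem shows up: a row written with the same timestamp as the saved watermark after a sync has run would be missed, while using `>=` instead would re-send rows that were already copied.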
