Syncing a Postgres table between microservices (projection, logical replication)
In a microservice-oriented architecture we need to "sync" or "project" a table, or part of it, from one service to another in near real-time. Given the following scenario:
Service A
SchemaA.account
id | firstName | lastName | createdAt | deletedAt
1 | Hello | Name | 2022-07-05T15:05:39Z | Null
2 | Test | Name | 2022-07-05T16:05:39Z | Null
Service B
SchemaB.account
should be synced from SchemaA.account
id | deletedAt
1 | Null
2 | Null
Terminology
"Projection" is a term primarily used in event-sourced systems, so it is a bit misleading here. Our source is a relational database (Postgres), and so is the target. I assume logical replication is the correct term here, but I may be wrong.
Requirements
- As you can see, I only need to replicate part of the table SchemaA.account, meaning a subset of the columns.
- Service B won't have to write to that table, just read from it. So, the sync can be one-way.
- The solution should be as robust and fault-tolerant as possible. Think of Service B being unavailable for some time and unable to receive changes from Service A.
- If it is a low-level database solution/tool, it must be available in AWS RDS.
- Quick! Near real-time, not scheduled syncs.
Possible solutions
I don't want to reinvent the wheel! I guess I would have the most flexibility with SNS + SQS (e.g. Service A publishes a message to SNS on data mutations, an SQS queue owned by Service B subscribes, and Service B adds the data itself). However, I think this creates a lot of overhead.
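To make that overhead concrete: with SNS + SQS, Service B has to own the apply logic itself, including duplicate and out-of-order deliveries (standard SQS queues deliver at least once and do not guarantee ordering). A minimal sketch of such a consumer follows. It assumes a hypothetical message shape with a per-row `version` counter that Service A would have to maintain. `AccountChanged`, `AccountRow` and `apply_message` are illustrative names, not part of any AWS API, and the dict stands in for SchemaB.account.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AccountChanged:
    """Hypothetical message body published by Service A (not an AWS type)."""
    id: int
    deleted_at: Optional[str]  # ISO-8601 timestamp, or None if not deleted
    version: int               # assumed per-row counter incremented by Service A


@dataclass
class AccountRow:
    deleted_at: Optional[str]
    version: int


def apply_message(table: Dict[int, AccountRow], msg: AccountChanged) -> bool:
    """Upsert msg into the local copy; return False if it was stale or a duplicate."""
    current = table.get(msg.id)
    if current is not None and current.version >= msg.version:
        return False  # redelivered or out-of-order message: ignore it
    table[msg.id] = AccountRow(deleted_at=msg.deleted_at, version=msg.version)
    return True
```

Because stale and repeated messages are rejected by the version check, the handler can be retried safely, but the version column, the publishing code in Service A and this consumer are all extra moving parts to build and operate.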
I am currently lacking the right search terms. The term logical replication seems promising at first glance, but I am not sure whether the replication tools are going to solve my case. I don't want to replicate a whole schema for backup clusters; I want to sync data between microservices. pglogical also seems very promising, and instructions for enabling it in AWS RDS exist.
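For reference, native logical replication in PostgreSQL 15 and later can publish a subset of columns through a column list on `CREATE PUBLICATION`. The sketch below only builds the SQL text so it can be reviewed before anything is run. The publication and subscription names and the connection string are placeholders, and it assumes the column was created as the quoted identifier "deletedAt" and that the schema name folds to lowercase schemaa. Two caveats to check against your setup: native logical replication matches tables by their fully qualified name, so the subscriber needs a table called schemaa.account rather than SchemaB.account, and on RDS logical replication has to be switched on through the `rds.logical_replication` parameter.

```python
from typing import Sequence


def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def publication_sql(pub: str, schema: str, table: str, columns: Sequence[str]) -> str:
    """CREATE PUBLICATION with a column list (PostgreSQL 15+ syntax)."""
    cols = ", ".join(quote_ident(c) for c in columns)
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return f"CREATE PUBLICATION {quote_ident(pub)} FOR TABLE {target} ({cols});"


def subscription_sql(sub: str, pub: str, conninfo: str) -> str:
    """CREATE SUBSCRIPTION, to be run on Service B's database."""
    escaped = conninfo.replace("'", "''")  # escape single quotes in the literal
    return (
        f"CREATE SUBSCRIPTION {quote_ident(sub)} "
        f"CONNECTION '{escaped}' PUBLICATION {quote_ident(pub)};"
    )


# Placeholder names. The primary key "id" must stay in the column list,
# because UPDATE and DELETE are replicated using the replica identity.
print(publication_sql("account_pub", "schemaa", "account", ["id", "deletedAt"]))
print(subscription_sql("account_sub", "account_pub",
                       "host=service-a-db dbname=app user=replicator"))
```

On the fault-tolerance requirement: while the subscriber is unavailable, the publisher's replication slot retains WAL until the subscriber reconnects, so changes are not lost, but a long outage can consume disk space on the source.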
The question really is that simple: am I on the right track or is there something obvious that I am not considering?
Comments (1)
The approach that sounds most promising is Change Data Capture, which in Postgres leverages the logical replication protocol to capture changes to the data; these changes can then be used to propagate data updates to other services.
For example, one very commonly used tool for implementing CDC in databases like Postgres is Debezium, which has a guide (no idea if it's current, but it should at least be a start) to using Debezium with RDS Postgres. Debezium will publish the changes to Kafka (be careful about the retention settings for topics: the Kafka defaults are often a ticking time bomb for this use case), and some other component will consume the changes from Kafka and, e.g., update the table.
One benefit of this approach is that you can introduce a new column into the source database's table without necessarily having to change anything in the destination database: the change consumer will see the new field, but it is under no obligation to propagate it.
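A minimal sketch of such a change consumer, assuming events arrive as already-decoded Debezium-style payloads with `op`, `before` and `after` fields. The Kafka client, deserialization and the real database write are left out, and the dict stands in for SchemaB.account. `PROJECTED_COLUMNS` and `apply_change` are illustrative names.

```python
from typing import Any, Dict

# Columns Service B keeps; every other field in the event is ignored.
PROJECTED_COLUMNS = ("id", "deletedAt")


def apply_change(table: Dict[int, Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Apply one Debezium-style change event to the local copy of the table."""
    op = event["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        after = event["after"]
        table[after["id"]] = {col: after.get(col) for col in PROJECTED_COLUMNS}
    elif op == "d":  # delete: only "before" is populated
        table.pop(event["before"]["id"], None)
    # Other operations (e.g. truncate) are ignored in this sketch.


accounts: Dict[int, Dict[str, Any]] = {}
apply_change(accounts, {
    "op": "c",
    "before": None,
    "after": {"id": 1, "firstName": "Hello", "lastName": "Name",
              "createdAt": "2022-07-05T15:05:39Z", "deletedAt": None},
})
```

Since the consumer copies only the columns it names, a column added to SchemaA.account later shows up in `after` and is simply dropped, which is the behaviour described above.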