Debezium Server with the Azure Event Hubs sink: sending messages to multiple partition keys
I'm implementing CDC for a PostgreSQL Azure database, and I want the events to be sent to Azure Event Hubs. My current plan is to use Debezium Server with the Event Hubs sink. However, I want to enforce the order of events per table. From this article I know I can do this by having a single topic with multiple partitions and always sending the events from a given table to the same partition.

However, Debezium doesn't seem to provide a nice way to handle this. You can specify a single partition key that all events are sent to, but not dynamically per event. The only other things I saw that could solve this are a custom sink implementation, or a custom EventHubProducerClient implementation passed into the config.

What are my options for handling this? Is there another way to architect this solution so that I don't have to use partition keys? Or is a custom sink implementation my best bet? Or should I just drop Debezium and write a custom listener/publisher?
Context / requirements
- Typically, to run Debezium you need a Kafka instance running. If possible I don't want to use Kafka, as I'm already planning on using Event Hubs; it would be redundant, and it is another service that needs to be maintained.
- FIFO ordering of events per table when read by consumers of the event hub.
- All logical database changes are turned into events.
- There are no Java developers on the team, so a custom (Java) implementation would be a stretch for our expertise.
1 Answer
Example configuration:
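The configuration snippet the answer refers to was not preserved in this copy. Based on the surrounding discussion (hashing `source.table`, 32 partitions), it likely used Debezium's `PartitionRouting` SMT; the sketch below follows the property names from the Debezium docs, while the hub name and connection-string placeholder are made-up examples:

```properties
# Debezium Server: Event Hubs sink (hub name / connection string are placeholders)
debezium.sink.type=eventhubs
debezium.sink.eventhubs.connectionstring=${EVENTHUBS_CONNECTION_STRING}
debezium.sink.eventhubs.hubname=my-hub

# Route events to partitions by table using the PartitionRouting SMT
debezium.transforms=PartitionRouting
debezium.transforms.PartitionRouting.type=io.debezium.transforms.partitions.PartitionRouting
debezium.transforms.PartitionRouting.partition.payload.fields=source.table
debezium.transforms.PartitionRouting.partition.topic.num=32
```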
The `partition.payload.fields` setting (see the docs) determines which fields of the event the hash function should use to distribute events across partitions. `source.table` is the table name without the schema, so `TableOne`/`TableTwo`/`TableThree` in this example.

Given the above configuration, all events from `TableOne`/`TableTwo`/`TableThree` will each be sent to exactly one partition (probably to three different ones, one per table). So only 3 of the configured 32 partitions would be used.

If the setting were `fields=source.table,source.table,change.Id`, then all events from `TableOne` and `TableTwo` would be sent to their own partitions, while the events from `TableThree` would be divided evenly between all 32 partitions (but all events for a particular row would always be sent to the same partition).
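To illustrate the mechanism (this is a simplified sketch, not Debezium's actual implementation), the routing boils down to hashing the configured payload field and taking it modulo the partition count, so every event from a given table always lands on the same partition:

```java
import java.util.List;

public class PartitionRoutingSketch {

    // Hypothetical stand-in for the PartitionRouting SMT: hash the routing
    // key (e.g. the value of source.table) modulo the partition count.
    static int partitionFor(String routingKey, int numPartitions) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(routingKey.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 32;
        for (String table : List.of("TableOne", "TableTwo", "TableThree")) {
            System.out.println(table + " -> partition " + partitionFor(table, partitions));
        }
    }
}
```

Because the function is deterministic, per-table FIFO ordering holds as long as each partition is read by a single consumer in order.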