Kafka Connect sink connector with multiple single-partition topics
We would like to use a Kafka Connect sink connector to copy messages from Kafka to MongoDB.
In our use case, we have multiple topics, each with one partition (the topic names can be matched with a regex, for example, topic.XXX.name). The number of these topics is increasing continuously.
I wonder whether the Kafka Connect architecture fits this use case. If so, how can it be configured to gain high scalability and parallelism? What should tasks.max be? How many workers?
Answers (1)
Kafka Connect is flexible; the answer is as many as you need.
The number of running tasks per connect worker is mostly only limited by the JVM heap size of each worker. Adding more workers will allow you to have more total active connectors.
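For reference, a Connect cluster scales horizontally by starting extra workers that share the same `group.id`. A minimal distributed-worker sketch, with placeholder broker addresses and storage-topic names:

```properties
# connect-distributed.properties, shared by every worker in the cluster
bootstrap.servers=kafka-1:9092,kafka-2:9092   # placeholder broker list
group.id=connect-cluster                      # workers with the same group.id form one cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
```

Starting another worker process with this same file adds capacity; the cluster redistributes connectors and tasks across workers automatically.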
For sink connectors, however, you can only have as many total tasks as total topic partitions being consumed.
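To make that concrete, here is a hedged sketch of a sink connector config using the official MongoDB Kafka connector class; the name, regex, URI, database, and collection are placeholders for your setup. Since each of your topics has one partition, the effective parallelism is min(tasks.max, number of matched topics):

```json
{
  "name": "mongo-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics.regex": "topic\\..*\\.name",
    "tasks.max": "10",
    "connection.uri": "mongodb://mongo:27017",
    "database": "mydb",
    "collection": "events"
  }
}
```

POSTing this JSON to the Connect REST API's `/connectors` endpoint creates the connector. Raising tasks.max beyond the total partition count only produces idle tasks.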
One thing to worry about, though, is frequent consumer group rebalancing as you add more and more topics. For this reason, it is recommended to create independent connectors for any critical data.
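As a sketch of that isolation, a critical topic can get its own connector, so rebalances triggered when the regex-based connector picks up new topics do not pause its delivery (the topic name here is hypothetical):

```json
{
  "name": "mongo-sink-critical",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "topic.payments.name",
    "tasks.max": "1",
    "connection.uri": "mongodb://mongo:27017",
    "database": "mydb",
    "collection": "payments"
  }
}
```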