How to handle Kafka consumer failure | soft-commit mechanism
I have multiple consumers reading (in fixed batch sizes) from 1 topic with multiple partitions.
Each consumer starts reading at the start of an hour, keeps aggregating (count, sum) in memory, and at the end of the hour saves the aggregations to SQL.
Each consumer has to commit (manually or automatically) the offsets it has read as soon as possible, so that the other consumers don't re-read them. Otherwise, the final aggregations will be incorrect.
The issue is that if any consumer fails before saving its aggregations to SQL, the offsets it has already committed will be skipped by the other consumers.
Hence there will be data loss.
Is there any workaround for this kind of scenario? I searched for something like a soft commit, but I don't think there is anything of that sort in Kafka.
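To make the failure mode concrete, here is a toy simulation of the commit-before-save ordering described above. It does not use a real Kafka API; the function names and batch layout are hypothetical, and "committing" is just advancing an offset counter, but the ordering matches the scenario: offsets are committed right after each read, while the aggregate lives only in memory until the end of the hour.

```python
def run_hour(batches, fail_after=None):
    """Consume batches, committing offsets immediately after each read and
    aggregating (a sum) in memory. Returns (committed_offset, saved_aggregate),
    where saved_aggregate is None if the consumer crashed before the SQL save.
    """
    committed = 0   # highest offset committed back to the broker
    agg = 0         # in-memory aggregation (here: a running sum)
    for i, batch in enumerate(batches):
        committed += len(batch)   # offsets committed as soon as the batch is read
        agg += sum(batch)         # aggregate kept only in memory
        if fail_after is not None and i == fail_after:
            return committed, None  # crash before the end-of-hour SQL save
    return committed, agg           # end of hour: aggregate persisted to SQL

batches = [[1, 2], [3, 4], [5, 6]]

# Happy path: all 6 records are read, committed, and aggregated.
ok_offset, ok_agg = run_hour(batches)

# Crash after the second batch: 4 offsets are already committed, so a
# restarted consumer resumes at offset 4. The partial sum of records 1..4
# existed only in memory, so it is lost for good.
crash_offset, crash_agg = run_hour(batches, fail_after=1)
```

Running this, the crashed case ends with `committed == 4` and no saved aggregate, which is exactly the gap described: the committed-but-unsaved records are never re-read by anyone.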