Modeling a Kafka cluster

Posted on 2025-01-12 05:31:20

I have an API endpoint that accepts events containing a user ID and some other data. I want those events broadcast to several external locations, and I want to explore using Kafka as a solution for that.

I have the following requirements:

  1. Events with the same UserID should be delivered in order to the external locations.
  2. Events should be persisted.
  3. If a single external location is failing, that shouldn't delay delivery to other locations.

Initially, from some reading I did, it seemed that I want N consumers, where N is the number of external locations I want to broadcast to. That should fulfill requirement (3). I also probably want one producer, my API, that will push events to my Kafka cluster. Requirement (2) should come automatically with Kafka.

I am less sure how to model the Kafka cluster itself. Again, from the reading I did, having millions of topics sounds like bad practice, so a topic per UserID is not an option. The other option I read about is one partition per UserID (let's say M partitions). If I understand correctly, that would satisfy requirement (1) out of the box. But would that also mean I need M brokers? That sounds unreasonable too.

What would be the best way to fulfill all requirements? As a start, I plan on hosting this with a local Kafka cluster.

Comments (1)

郁金香雨 2025-01-19 05:31:20

You are correct that one topic per user is not ideal.

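Partition count is not dependent on broker count (a single broker can host many partitions), so partitioning is the better design. You also do not need one partition per user: if the UserID is used as the message key, all events with the same key go to the same partition, and Kafka preserves order within a partition.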
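To illustrate how keyed partitioning keeps one user's events in order, here is a minimal sketch in plain Python with no Kafka client involved. The names `partition_for`, `route`, and `NUM_PARTITIONS` are hypothetical, and `zlib.crc32` is only a stand-in hash: Kafka's default partitioner uses a different hash (murmur2) on the key bytes, but the idea of "hash of the key modulo the partition count" is the same.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 6  # hypothetical value; far fewer partitions than users


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition: hash(key) mod partition count.

    crc32 is a stand-in here; it is not the hash Kafka itself uses.
    """
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


def route(events, num_partitions: int = NUM_PARTITIONS):
    """Append (user_id, payload) events to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for user_id, payload in events:
        logs[partition_for(user_id, num_partitions)].append((user_id, payload))
    return logs


events = [("alice", 1), ("bob", 1), ("alice", 2), ("carol", 1), ("alice", 3)]
logs = route(events)

# Every "alice" event sits in one partition, in the order it was produced.
alice_log = logs[partition_for("alice")]
alice_payloads = [payload for uid, payload in alice_log if uid == "alice"]
print(alice_payloads)  # [1, 2, 3]
```

Several users may share a partition; that does not break per-user ordering, because each partition is an ordered log and a given user never spans more than one of them as long as the partition count stays fixed.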

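Regarding "If a single external location is failing, that shouldn't delay delivery to other locations":

This is standard consumer-group behavior, not a matter of topic/partition design. Give each external location its own consumer group; each group tracks its own offsets, so a failing location falls behind without holding up the others.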
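The independence between groups can be sketched with a small in-memory model. This is not Kafka client code: `Log`, `GroupConsumer`, `healthy`, and `failing` are hypothetical names, and the only assumption carried over from Kafka is that each consumer group keeps its own read position in the same log.

```python
class Log:
    """An append-only list standing in for one topic partition."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)


class GroupConsumer:
    """One consumer group: reads the shared log at its own offset."""

    def __init__(self, name, deliver):
        self.name = name
        self.deliver = deliver  # callable that sends a record to an external location
        self.offset = 0         # position is per group, not shared
        self.delivered = []

    def poll(self, log):
        """Deliver pending records in order; stop at the first failure."""
        while self.offset < len(log.records):
            record = log.records[self.offset]
            try:
                self.deliver(record)
            except ConnectionError:
                break  # keep the offset so the record is retried later, in order
            self.delivered.append(record)
            self.offset += 1


def healthy(record):
    pass  # pretend the external location accepted the record


def failing(record):
    raise ConnectionError("location unavailable")


log = Log()
for i in range(3):
    log.append(("alice", i))

group_a = GroupConsumer("location-a", healthy)
group_b = GroupConsumer("location-b", failing)
for group in (group_a, group_b):
    group.poll(log)

print(group_a.offset, group_b.offset)  # 3 0

# Once location-b recovers, it resumes from its own offset and catches up in order.
group_b.deliver = healthy
group_b.poll(log)
print(group_b.delivered)  # [('alice', 0), ('alice', 1), ('alice', 2)]
```

Because the records stay in the log (requirement 2) and each group only advances its own offset, the slow group loses nothing and the healthy group is never blocked.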
