System Design - Update the cache only after persisting to the database?
After watching this awesome talk by Martin Kleppmann about how Kafka can be used to stream events so that we can get rid of 2-phase commits, I have a couple of questions related to updating a cache only when the database is updated properly.
Problem Statement
Let's say you have a Redis cache which stores the user's profile pic, and a Postgres database which is used for all the user-related operations (creation, update, deletion, etc.).
I want to update my Redis cache if and only if a new user has been successfully added to my database.
How can I do that using Kafka?
If I take the example given in the video, the workflow would go something like this:
- User registers.
- The request is handled by the User Registration microservice.
- The User Registration microservice inserts a new entry into the users table.
- It then produces a User Creation Event to the user_created topic.
- The Cache Population microservice consumes the newly created User Creation Event.
- The Cache Population microservice updates the Redis cache.
The problem: what happens if the User Registration microservice crashes just after writing to the database, but before it can send the event to Kafka?
What would be the correct way of handling this?
- Does the User Registration microservice maintain the last event it published? How can it reliably do that? Does it write to a DB? Then the problem starts all over again: what if it publishes the event to Kafka but fails before it can update its last known offset?
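To make the failure window concrete, here is a minimal sketch of the dual write I am describing (Python with psycopg2 and confluent-kafka; the table, topic, and connection details are just illustrations of the setup above):

```python
import json
import psycopg2
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
conn = psycopg2.connect("dbname=app user=app")

def register_user(user_id: str, profile_pic_url: str) -> None:
    # Step 1: insert the user and commit.
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO users (id, profile_pic_url) VALUES (%s, %s)",
            (user_id, profile_pic_url),
        )
    # <-- crash window: the row is committed, but no event exists yet.
    # Step 2: publish the event. If the process dies before this call
    # (or before flush()), the cache never hears about the new user.
    producer.produce(
        "user_created",
        key=user_id,
        value=json.dumps({"id": user_id, "profile_pic_url": profile_pic_url}),
    )
    producer.flush()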
There are three broad approaches one can take for this:
There's the transactional outbox pattern, wherein, in the same transaction as inserting the new entry into the user table, a corresponding user creation event is inserted into an outbox table. Some process then eventually queries that outbox table, publishes the events in that table to Kafka, and deletes the events in the table. Since the inserts are in the same transaction, they either both occur or neither occurs; barring a bug in the process which publishes the outbox to Kafka, this guarantees that every user insert eventually has an associated event published (at least once) to Kafka.
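As a sketch of what that could look like (Python with psycopg2 and confluent-kafka; the outbox schema and names here are assumptions, not a fixed part of the pattern):

```python
import json
import psycopg2
from confluent_kafka import Producer

def register_user(conn, user_id: str, profile_pic_url: str) -> None:
    # Both inserts share one transaction: either the user row and its
    # outbox event are both committed, or neither is.
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO users (id, profile_pic_url) VALUES (%s, %s)",
            (user_id, profile_pic_url),
        )
        cur.execute(
            "INSERT INTO outbox (topic, key, payload) VALUES (%s, %s, %s)",
            ("user_created", user_id,
             json.dumps({"id": user_id, "profile_pic_url": profile_pic_url})),
        )

def drain_outbox(conn, producer: Producer) -> None:
    # A relay process polls the outbox, publishes each event, and only
    # deletes rows after Kafka acknowledges them.
    with conn, conn.cursor() as cur:
        cur.execute("SELECT id, topic, key, payload FROM outbox ORDER BY id LIMIT 100")
        rows = cur.fetchall()
        for _, topic, key, payload in rows:
            producer.produce(topic, key=key, value=payload)
        producer.flush()
        if rows:
            cur.execute("DELETE FROM outbox WHERE id = ANY(%s)",
                        ([r[0] for r in rows],))
```

Note that the relay deletes rows only after Kafka has acknowledged them, so a crash between publish and delete causes redelivery rather than loss; downstream consumers must therefore tolerate duplicates.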
There's a more event-sourcingish pattern, where you publish the user creation event to Kafka and then some consuming process inserts into the user table based on the event. Since this happens with a delay, this strongly suggests that the user registration service needs to keep state of which users it has published creation events for (with the combination of Kafka and Postgres being the source of truth for this). Since Kafka allows a message to be consumed by arbitrarily many consumers, a different consumer can then update Redis.
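A sketch of the consuming side of that pattern, under the same assumed event shape as above; a second consumer group (say, a hypothetical cache-writer) would apply the same events to Redis independently:

```python
import json
import psycopg2
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "user-table-writer",  # another group can feed Redis
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["user_created"])
conn = psycopg2.connect("dbname=app user=app")

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    with conn, conn.cursor() as cur:
        # Idempotent insert, so redelivery after a crash is harmless.
        cur.execute(
            "INSERT INTO users (id, profile_pic_url) VALUES (%s, %s) "
            "ON CONFLICT (id) DO NOTHING",
            (event["id"], event["profile_pic_url"]),
        )
    # Commit the offset only after the row is durably applied.
    consumer.commit(msg, asynchronous=False)
```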
Change data capture (e.g. Debezium) can be used to tie into Postgres' write-ahead log (as Postgres actually event sources under the hood...) and publish an event that essentially says "this row was inserted into the user table" to Kafka. A consumer of that event can then translate that into a user created event.
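A sketch of such a translator, assuming Debezium's default JSON envelope and topic naming (server.schema.table; the "appdb" prefix and column names are assumptions):

```python
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "cdc-translator",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["appdb.public.users"])  # raw CDC topic from Debezium
producer = Producer({"bootstrap.servers": "localhost:9092"})

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    change = json.loads(msg.value())["payload"]  # Debezium envelope
    if change["op"] == "c":  # "c" = row created
        row = change["after"]
        # Re-add the domain context CDC threw away: emit a proper
        # user-creation event rather than a raw row image.
        producer.produce(
            "user_created",
            key=str(row["id"]),
            value=json.dumps({"id": row["id"],
                              "profile_pic_url": row["profile_pic_url"]}),
        )
        producer.flush()
```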
CDC in some sense moves the transactional outbox into the infrastructure, at the cost of requiring that the context it inherently throws away be reconstructed later (which is not always possible).
That said, I'd strongly advise against having ____ creation be a microservice and I'd likewise strongly advise against a RInK store like Redis. Both of these smell like attempts to paper over architectural deficiencies by adding microservices and caches.
The one-foot-on-the-way-to-event-sourcing approach isn't one I'd recommend, but if one starts there, the requirement to make the registration service stateful suddenly opens up possibilities which may remove the need for Redis, limit the need for a Kafka-like thing, and allow you to treat the existence of a DB as an implementation detail.