How do you react to the absence of an event in a distributed system?

Posted 2024-11-03 15:25:53

I have a system that collects session data. A session consists of a number of distinct events, for example "session started" and "action X performed". There is no way to determine when a session ends, so instead heartbeat events are sent at regular intervals.

This is the main complication: since nothing signals that a session has ended, the only option is to react to the absence of an event, i.e. no more heartbeats. How can I do this efficiently and correctly in a distributed system?

Here is some more background to the problem:

The events must then be assembled into objects representing sessions. The session objects are later updated with additional data from other systems, and eventually they are used to calculate things like the number of sessions, average session length, etc.

The system must scale horizontally, so there are multiple servers that receive the events and multiple servers that process them. Events belonging to the same session can be sent to and processed by different servers. This means there's no guarantee that they will be processed in order, and there are additional complications that mean events can be duplicated (and there's always the risk that some are lost, either before they reach our servers or while being processed).

Most of this exists already, but I have no good solution for efficiently and correctly determining when a session has ended. The way I do it now is to periodically search through the collection of "incomplete" session objects, looking for any that have not been updated in an amount of time equal to two heartbeats, and move these to another collection of "complete" sessions. This operation is time-consuming and inefficient, and it doesn't scale well horizontally. Basically it consists of sorting a table on a column representing the last timestamp and filtering out any rows that aren't old enough. It sounds simple, but it's hard to parallelize. If you do it too often you won't be doing anything else, because the database will be busy filtering your data; if you don't do it often enough, each run will be slow because there's too much to process.

I'd like to react to when a session has not been updated for a while, not poll every session to see if it's been updated.

Update: Just to give you a sense of scale: there are hundreds of thousands of sessions active at any time, and eventually there will be millions.


Comments (2)

一抹淡然 2024-11-10 15:25:53

One possibility that comes to mind:

In your database table that keeps track of sessions, add a timestamp field (if you don't have one already) that records the last time the session was "active". Update the timestamp whenever you get a heartbeat.

When you create a session, schedule a "timer event" to fire after some suitable delay to check whether the session should be expired. When the timer event fires, check the session's timestamp to see if there's been more activity during the interval that the timer was waiting. If so, the session is still active, so schedule another timer event to check again later. If not, the session has timed out, so remove it.
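A minimal sketch of that timer idea in Python, with simulated time so it can be run without waiting. Everything here is an assumption made for illustration: the names `SessionStore`, `ExpiryChecker`, `touch`, `schedule` and `run_due` are hypothetical, the dict stands in for the shared database table, the 30-second interval is made up, and the heap stands in for whatever timer facility your servers actually provide. One small liberty: when a timer fires and the session is still active, the next check is scheduled relative to the last recorded activity rather than a full delay from now.

```python
import heapq

HEARTBEAT_INTERVAL = 30           # seconds; assumed value for illustration
TIMEOUT = 2 * HEARTBEAT_INTERVAL  # the "two heartbeats" rule from the question


class SessionStore:
    """Stand-in for the shared sessions table: session_id -> last-active time."""

    def __init__(self):
        self.last_active = {}

    def touch(self, session_id, now):
        # Called on session start and on every heartbeat, by any server.
        # max() keeps the timestamp from moving backwards when events
        # arrive out of order or are duplicated.
        previous = self.last_active.get(session_id, now)
        self.last_active[session_id] = max(previous, now)


class ExpiryChecker:
    """The timer events owned by one server (hypothetical helper)."""

    def __init__(self, store):
        self.store = store
        self.timers = []   # min-heap of (fire_at, session_id)
        self.expired = []  # sessions this server has declared finished

    def schedule(self, session_id, now):
        heapq.heappush(self.timers, (now + TIMEOUT, session_id))

    def run_due(self, now):
        # Fire every timer whose time has come.
        while self.timers and self.timers[0][0] <= now:
            _, session_id = heapq.heappop(self.timers)
            last = self.store.last_active.get(session_id)
            if last is None:
                continue  # session already removed
            if now - last >= TIMEOUT:
                # No activity while the timer was waiting: session timed out.
                del self.store.last_active[session_id]
                self.expired.append(session_id)
            else:
                # Still active: check again once the timeout could have elapsed.
                heapq.heappush(self.timers, (last + TIMEOUT, session_id))
```

Each timer firing costs one lookup by key, and a session with steady heartbeats is re-checked roughly once per timeout period rather than on every scan.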

If you use this approach, each session will always have one server responsible for checking whether it's expired, but different servers can be responsible for different sessions, so the workload can be spread around evenly. When a heartbeat comes in, it doesn't matter which server handles it, because it just updates a timestamp in a database that's (presumably) shared between all the servers.

There's still some polling involved since you'll get periodic timer events that make you check whether a session is expired even when it hasn't expired. That could be avoided if you could just cancel the pending timer event each time a heartbeat arrives, but with multiple servers that's tricky: the server that handles the heartbeat may not be the same one that has the timer scheduled. At any rate, the database query involved is lightweight: just looking up one row (the session record) by its primary key, with no sorting or inequality comparisons.

香橙ぽ 2024-11-10 15:25:53

So you're collecting heartbeats; I'm wondering if you could have a batch process (or something) that ran across the collected heartbeats looking for patterns that implied the end of a session.

The level of accuracy is governed by how regular the heartbeats are and how often you scan across the collected heartbeats.

The advantage is you're processing all heartbeats through a single mechanism (in one spot - you don't have to poll each heartbeat on its own) so that should be able to scale - if it was a database-centric solution that should be able to cope with lots of data, right?
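As a rough illustration of what such a batch pass might look like, here is a small Python sketch. The function name `find_ended_sessions`, the `(session_id, timestamp)` tuple layout and the `missed_allowed` parameter are all hypothetical; a real version would presumably run as a query or job inside whatever store holds the heartbeats.

```python
def find_ended_sessions(heartbeats, now, heartbeat_interval, missed_allowed=2):
    """Return the ids of sessions whose latest heartbeat is too old.

    heartbeats: iterable of (session_id, timestamp) pairs, in any order,
    possibly containing duplicates.
    """
    latest = {}
    for session_id, ts in heartbeats:
        # Keep only the newest timestamp per session; duplicates and
        # out-of-order arrivals do not change the result.
        if ts > latest.get(session_id, float("-inf")):
            latest[session_id] = ts

    # A session counts as ended once `missed_allowed` intervals have passed
    # without a heartbeat.
    cutoff = now - missed_allowed * heartbeat_interval
    return sorted(sid for sid, ts in latest.items() if ts <= cutoff)
```

The pass is a single scan over the collected heartbeats, so its accuracy depends, as noted above, on how regular the heartbeats are and how often the scan runs.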

There might be a more elegant solution but my brain's a bit full just now :)
