Flink聚集功能与keyedProcesfunction and valuestate
我们有一个应用程序,可以消耗来自Kafka源的事件。处理每个元素的逻辑需要考虑以前收到的事件(具有相同的分区密钥),而无需花费时间进行窗口。第一个实现使用了GlobalWindow,并进行了汇总功能,以保留当前状态信息和触发器,始终会在Onelement Call中发射。我猜想使用keyedProcesfunction的替代方案并将状态保持在估值对象中会更加足够,因为我们并没有真正考虑到时间,也没有使用任何自定义触发。这个假设是否正确,并且这些批准中的任何一个都有弊端吗?
We have an application that consumes events from a kafka source. The logic from processing each element needs to take into account the events that were previously received (having the same partition key), without using time for windowing. The first implementation used a GlobalWindow, with an AggregateFunction for keeping the current state information and a trigger that would always fire in onElement call. I am guessing that the alternative of using a KeyedProcessFunction that and holds the state in a ValueState object would be more adequate, since we are not really taking timing into account, nor using any custom triggering. Is this assumption correct and are there any downsides to either one of these approaces?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
在这种情况下,首选使用KeyedProcessfunction。它将所有相关的逻辑都放入一个对象中 - 而不必协调全球风格中发生的事情,聚合功能和触发器(也许也是驱逐者)。我发现这是在更可维护和可测试的实施中发现的结果,而且您对国家管理有更直接的控制。
我看不到基于窗口的解决方案的任何优势。
In prefer using a KeyedProcessFunction in cases like this. It puts all of the related logic into one object -- rather than having to coordinate what's going on in a GlobalWindow, an AggregateFunction, and a Trigger (and perhaps also an Evictor). I find this results in implementations that are more maintainable and testable, plus you have more straightforward control over state management.
I don't see any advantages to a solution based on windows.