How do I define a pattern with match_recognize to find ordered events that are not consecutive?
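I'm trying to find sessions that contain three specific events in order: event_1 happens first, then event_2, then event_3. They don't need to occur directly one after another; any number of other events can occur between them. How do I define a pattern in the match_recognize clause that lets me label these events with the classifier, and also label them when the sequence is incomplete, for example when only event_1 occurs, or only event_1 and event_2?
Or is there a more efficient way to do this that doesn't involve match_recognize? I'm trying to avoid multiple joins because the data is huge.
Here's a dummy query for illustration: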
select
    session_id,
    event,
    event_dttm
from events
match_recognize (
    partition by session_id
    order by event_dttm
    measures
        classifier as var
    all rows per match with unmatched rows
    pattern (???answer needed???)
    define
        event_1 as event = 'Click image',
        event_2 as event = 'Open profile',
        event_3 as event = 'Leave review');
Comments (1)
continue
You can add another pattern variable that matches "anything but those three events", then allow zero or more matches of it between the events you care about.
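A minimal sketch of that idea, not the answer's original code: it assumes a Snowflake-style MATCH_RECOGNIZE dialect and non-null `event` values. The variable name `other` and the nested optional groups are illustrative assumptions; the optional groups are what let an incomplete sequence (event_1 only, or event_1 followed by event_2) still match.

```sql
-- Sketch only: assumes Snowflake-style MATCH_RECOGNIZE and non-null event values.
select
    session_id,
    event,
    event_dttm,
    var  -- name of the pattern variable that matched the row
from events
match_recognize (
    partition by session_id
    order by event_dttm
    measures
        classifier() as var
    all rows per match with unmatched rows
    -- event_1 is required; each later step sits in an optional group,
    -- so a partial sequence is still reported as a match
    pattern (event_1 (other* event_2 (other* event_3)?)?)
    define
        event_1 as event = 'Click image',
        event_2 as event = 'Open profile',
        event_3 as event = 'Leave review',
        -- "other" is a hypothetical name for the filler rows between steps
        other as event not in ('Click image', 'Open profile', 'Leave review')
);
```

Because `other` excludes all three target events, a repeated 'Click image' or an early 'Leave review' between steps ends the match at that point. If such rows should be tolerated, one option is a separate filler per step, for example one defined as `event <> 'Open profile'` before event_2.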