如何通过连续日期变量检查值
我在雪花中有数据库表,如果有flag_1 == 1在至少3天时,我需要检查每个客户。 FLAG_1指示该订单是否包含任何特定商品。并使用customer_id和flag_2创建新表。我真的不知道如何处理这个问题。
示例表:
CREATE TABLE TMP_TEST
(
CUSTOMER_ID INT,
ORDER_DATE DATE,
FLAG_1 INT
);
INSERT INTO TMP_TEST (CUSTOMER_ID, ORDER_DATE, FLAG_1)
VALUES
(001, '2020-04-01', 0),
(001, '2020-04-02', 1),
(001, '2020-04-03', 1),
(001, '2020-04-04', 1),
(001, '2020-04-05', 1),
(001, '2020-04-06', 0),
(001, '2020-04-07', 0),
(001, '2020-04-08', 0),
(001, '2020-04-09', 1),
(002, '2020-04-10', 1),
(002, '2020-04-11', 0),
(002, '2020-04-12', 0),
(002, '2020-04-13', 1),
(002, '2020-04-14', 1),
(002, '2020-04-15', 0),
(002, '2020-04-16', 1),
(002, '2020-04-17', 1);
预期输出表:
CUSTOMER_ID FLAG_2
001 1
002 0
I have database table in SNOWFLAKE, where I need check for each customer if there is FLAG_1 == 1 at minimum 3 days in row. Flag_1 indicates whether the order contained any specific goods. And create new table with customer_id and flag_2. I really don't know how to handle this problem.
Sample table:
CREATE TABLE TMP_TEST
(
CUSTOMER_ID INT,
ORDER_DATE DATE,
FLAG_1 INT
);
INSERT INTO TMP_TEST (CUSTOMER_ID, ORDER_DATE, FLAG_1)
VALUES
(001, '2020-04-01', 0),
(001, '2020-04-02', 1),
(001, '2020-04-03', 1),
(001, '2020-04-04', 1),
(001, '2020-04-05', 1),
(001, '2020-04-06', 0),
(001, '2020-04-07', 0),
(001, '2020-04-08', 0),
(001, '2020-04-09', 1),
(002, '2020-04-10', 1),
(002, '2020-04-11', 0),
(002, '2020-04-12', 0),
(002, '2020-04-13', 1),
(002, '2020-04-14', 1),
(002, '2020-04-15', 0),
(002, '2020-04-16', 1),
(002, '2020-04-17', 1);
Expected output table:
CUSTOMER_ID FLAG_2
001 1
002 0
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
也许这会有所帮助:
Maybe this can be help:
使用count_if也有效:
using COUNT_IF also works:
雪花支持
要查找3个或更多事件,模式为
模式(a {3,})
:output:
每组单行倒数
输出:
此解决方案的功能位于模式部分,可以在新条件下轻松扩展。例如:
此处:查找flag = 1的序列,然后是一两个出现flag = 0的事件,并以flag = 1结束。
Snowflake supports MATCH_RECOGNIZE which is the easiest way to detect advanced patterns across multiple rows:
To find 3 or more occurences the pattern is
PATTERN ( a{3,})
:Output:
Collapsing to single row per group:
Output:
The power of this solution lies at the PATTERN part which could be easily extended with new conditions. For instance:
Here: Find sequence of flag = 1, followed by one or two occurences of flag = 0 and ended by flag = 1.