根据14天的间隔创建组
我知道这是一个常见问题,但我找不到符合我的情况的东西。我有以下数据:
+---------+---------+
| user_id | view_dt |
+---------+---------+
| A | 1/1 |
+---------+---------+
| A | 1/10 |
+---------+---------+
| A | 1/14 |
+---------+---------+
| A | 1/22 |
+---------+---------+
| A | 1/23 |
+---------+---------+
| A | 1/30 |
+---------+---------+
我希望根据 14 天的间隔对这些数据进行分组。也就是说,组将是:
GROUP 1: 1/1, 1/10, 1/14
GROUP 2: 1/22, 1/23, 1/30
请注意,1/30 日期应该属于 GROUP 2,因为1/30 应与第 2 组的第一个日期 (1/22) 进行比较,而不是 (1/1)。
我遇到的问题是我自己的查询显示 1/30 属于组 3。
CREATE TABLE T (
user_id VARCHAR(20),
view_dt DATETIME
);
INSERT INTO t VALUES ('A', '2022-01-01');
INSERT INTO t VALUES ('A', '2022-01-10');
INSERT INTO t VALUES ('A', '2022-01-14');
INSERT INTO t VALUES ('A', '2022-01-22');
INSERT INTO t VALUES ('A', '2022-01-23');
INSERT INTO t VALUES ('A', '2022-01-30');
SELECT user_id,
view_dt,
DENSE_RANK() OVER(ORDER BY gr) grp
FROM (
SELECT
user_id,
view_dt,
CAST (view_dt - MIN (view_dt) OVER (PARTITION BY user_id ORDER BY view_dt) AS INT )/14 + 1 AS gr
FROM T
) x
ORDER BY user_id
理想输出
+---------+---------+-------+
| user_id | view_dt | group |
+---------+---------+-------+
| A | 1/1 | 1 |
+---------+---------+-------+
| A | 1/10 | 1 |
+---------+---------+-------+
| A | 1/14 | 1 |
+---------+---------+-------+
| A | 1/22 | 2 |
+---------+---------+-------+
| A | 1/23 | 2 |
+---------+---------+-------+
| A | 1/30 | 2 |
+---------+---------+-------+
我之前的查询的输出:
+---------+---------+-------+
| user_id | view_dt | group |
+---------+---------+-------+
| A | 1/1 | 1 |
+---------+---------+-------+
| A | 1/10 | 1 |
+---------+---------+-------+
| A | 1/14 | 1 |
+---------+---------+-------+
| A | 1/22 | 2 |
+---------+---------+-------+
| A | 1/23 | 2 |
+---------+---------+-------+
| A | 1/30 | 3** |
+---------+---------+-------+
I know this is a common question, but I couldn't find something that matches my case. I have this data:
+---------+---------+
| user_id | view_dt |
+---------+---------+
| A | 1/1 |
+---------+---------+
| A | 1/10 |
+---------+---------+
| A | 1/14 |
+---------+---------+
| A | 1/22 |
+---------+---------+
| A | 1/23 |
+---------+---------+
| A | 1/30 |
+---------+---------+
I want this data to be grouped based on a 14 day interval. That is, the groups would be:
GROUP 1: 1/1, 1/10, 1/14
GROUP 2: 1/22, 1/23, 1/30
Note that I the 1/30 date should belong to GROUP 2 since 1/30 should be compared to the first date of GROUP 2 (1/22) instead of (1/1).
The problem I am having is my own query shows 1/30 belonging to group 3.
CREATE TABLE T (
user_id VARCHAR(20),
view_dt DATETIME
);
INSERT INTO t VALUES ('A', '2022-01-01');
INSERT INTO t VALUES ('A', '2022-01-10');
INSERT INTO t VALUES ('A', '2022-01-14');
INSERT INTO t VALUES ('A', '2022-01-22');
INSERT INTO t VALUES ('A', '2022-01-23');
INSERT INTO t VALUES ('A', '2022-01-30');
SELECT user_id,
view_dt,
DENSE_RANK() OVER(ORDER BY gr) grp
FROM (
SELECT
user_id,
view_dt,
CAST (view_dt - MIN (view_dt) OVER (PARTITION BY user_id ORDER BY view_dt) AS INT )/14 + 1 AS gr
FROM T
) x
ORDER BY user_id
Ideal Output
+---------+---------+-------+
| user_id | view_dt | group |
+---------+---------+-------+
| A | 1/1 | 1 |
+---------+---------+-------+
| A | 1/10 | 1 |
+---------+---------+-------+
| A | 1/14 | 1 |
+---------+---------+-------+
| A | 1/22 | 2 |
+---------+---------+-------+
| A | 1/23 | 2 |
+---------+---------+-------+
| A | 1/30 | 2 |
+---------+---------+-------+
Output from my previous query:
+---------+---------+-------+
| user_id | view_dt | group |
+---------+---------+-------+
| A | 1/1 | 1 |
+---------+---------+-------+
| A | 1/10 | 1 |
+---------+---------+-------+
| A | 1/14 | 1 |
+---------+---------+-------+
| A | 1/22 | 2 |
+---------+---------+-------+
| A | 1/23 | 2 |
+---------+---------+-------+
| A | 1/30 | 3** |
+---------+---------+-------+
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
一种选择是使用递归 CTE
注意:请避免在日期上使用算术运算符。
view_dt - MIN (view_dt)
。应该使用datediff()
请参阅坏习惯Kick :使用简写进行日期/时间操作
One option is to use Recursive CTE
Note : Please do avoid using arithmetic operator on date.
view_dt - MIN (view_dt)
. Should usedatediff()
See Bad Habits to Kick : Using shorthand with date/time operations