Using MySQL window functions to increment a section number when a value changes
I am looking for a way to split a dataset into sections whenever a value changes inside a partition. Note that each switch of value starts a new section, even if the value switches back to one that appeared in an earlier section.
Trying to illustrate using some dummy data:
Schema (MySQL v8.0)
CREATE TABLE test (
    id INT AUTO_INCREMENT PRIMARY KEY,
    group_id INT,
    date DATE,
    value VARCHAR(255)
);
INSERT INTO test (group_id, date, value) VALUES
(2, '2022-03-12', 'D'),
(2, '2022-03-15', 'C'),
(1, '2022-03-27', 'B'),
(1, '2022-02-01', 'A'),
(2, '2022-03-10', 'D'),
(1, '2022-05-01', 'C'),
(2, '2022-01-19', 'C'),
(1, '2022-01-18', 'A'),
(2, '2022-04-12', 'D'),
(1, '2022-04-08', 'A');
Query #1
SELECT
    id,
    group_id,
    date,
    value,
    '' AS section # This is the one that I don't know how to calculate
FROM test
ORDER BY group_id, date;
Results I want (the section column is filled in by hand, since the query above does not calculate it):
id | group_id | date | value | section |
---|---|---|---|---|
8 | 1 | 2022-01-18 | A | 1 |
4 | 1 | 2022-02-01 | A | 1 |
3 | 1 | 2022-03-27 | B | 2 |
10 | 1 | 2022-04-08 | A | 3 |
6 | 1 | 2022-05-01 | C | 4 |
7 | 2 | 2022-01-19 | C | 5 |
5 | 2 | 2022-03-10 | D | 6 |
1 | 2 | 2022-03-12 | D | 6 |
2 | 2 | 2022-03-15 | C | 7 |
9 | 2 | 2022-04-12 | D | 8 |
Notice that my main problem is that I cannot partition by (group_id, value), since it's a new section whenever the value changes, regardless of whether it changes back to an earlier value (see value A, for example).
Comments (1)
You can get a number that distinguishes each separate run of a value within a group_id by subtracting the number of times that value has occurred for the group_id so far from the total number of rows seen for that group_id so far. A little thought will show that this difference stays the same within a run of consecutive identical values, and differs between separate runs of the same value.
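A minimal sketch of that subtraction against the `test` table from the question, assuming MySQL 8.0 window functions and that dates are unique within each group_id. The aliases `rn_all`, `rn_val`, and `run_key` are illustrative names, not taken from the original answer.

```sql
-- rn_all: position of the row within its group_id, ordered by date.
-- rn_val: position of the row among rows sharing its group_id and value.
-- run_key = rn_all - rn_val stays constant within a run of identical values.
SELECT
    id,
    group_id,
    date,
    value,
    ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY date)                AS rn_all,
    ROW_NUMBER() OVER (PARTITION BY group_id, value ORDER BY date)         AS rn_val,
    ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY date)
      - ROW_NUMBER() OVER (PARTITION BY group_id, value ORDER BY date)     AS run_key
FROM test
ORDER BY group_id, date;
```

For group_id 1 in the sample data, the values A, A, B, A, C get run_key 0, 0, 2, 1, 4, so the two separate runs of A can be told apart by the pair (value, run_key).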
From that number, you can calculate your sequential section number. There may be a way to do that directly (with one fewer subquery), but I had to use an intermediate step of getting the date on which each run of values for a group_id started.
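The steps described here might be put together roughly as follows. This is a sketch under stated assumptions, not the original answer's query: it assumes MySQL 8.0 (CTEs and window functions) and unique dates within each group_id, and the names `numbered`, `runs`, `run_key`, and `run_start` are made up for illustration.

```sql
WITH numbered AS (
    -- Step 1: a key that is constant within each run of identical values.
    SELECT
        id, group_id, date, value,
        ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY date)
          - ROW_NUMBER() OVER (PARTITION BY group_id, value ORDER BY date) AS run_key
    FROM test
),
runs AS (
    -- Step 2 (the intermediate step): the date on which each run started.
    SELECT
        id, group_id, date, value,
        MIN(date) OVER (PARTITION BY group_id, value, run_key) AS run_start
    FROM numbered
)
-- Step 3: number the runs sequentially across all groups.
SELECT
    id, group_id, date, value,
    DENSE_RANK() OVER (ORDER BY group_id, run_start) AS section
FROM runs
ORDER BY group_id, date;
```

On the sample data this yields sections 1, 1, 2, 3, 4 for group_id 1 and 5, 6, 6, 7, 8 for group_id 2, matching the table in the question. The intermediate `run_start` is needed because `run_key` alone is not ordered by time: the second run of A in group_id 1 has a smaller key than the run of B that precedes it.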
fiddle
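As a possible more direct variant with one fewer level of nesting, a different technique can be used: flag each row whose value differs from the previous row in its group with `LAG`, then take a running sum of the flags. This is not the approach described above; it is a sketch assuming MySQL 8.0 and unique dates within each group_id, and the names `flagged` and `is_new_section` are hypothetical.

```sql
SELECT
    id, group_id, date, value,
    -- Running count of section starts seen so far, across all groups.
    SUM(is_new_section) OVER (ORDER BY group_id, date) AS section
FROM (
    SELECT
        id, group_id, date, value,
        -- 0 when the value repeats the previous row of the same group;
        -- 1 otherwise, including the first row of each group (LAG is NULL).
        CASE
            WHEN value = LAG(value) OVER (PARTITION BY group_id ORDER BY date) THEN 0
            ELSE 1
        END AS is_new_section
    FROM test
) AS flagged
ORDER BY group_id, date;
```

Note that a NULL `value` never compares equal to its predecessor, so under this sketch every NULL row would start a new section.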