一个人如何查询表_1之类的表(如下所示示例),以便只有最大时间段没有其他行作为时间段的子集。例如,表_1应该给出表_2中的内容。我试图使用存在各种条件使用的地方,但似乎无法提取所需的行。
我尝试了以下查询:
SELECT * FROM table_1 x WHERE NOT EXISTS (SELECT * FROM table_2 y WHERE (y.start < x.start and y.end > x.end) or (y.start < x.start and y.end <= x.end) or (y.start >= x.start and y.end < x.end) and y.symbol = x.symbol);
任何帮助将不胜感激。使用PostgreSQL。
编辑:
我减少了表_1的大小以减少长度。到最长时间,我的意思是所有不相交的时期,因此它们之间没有任何时期。即,时间段是最大的 - &nbsp;那就是第r行最大,r.start&lt; = r'.start&lt; = r'.end&lt&lt; = r.end r.end r',其中r'是表中的其他任何行,相同的符号值。
因此,AJR返回两行的原因是因为没有在2021-01-23和2021-05-08之间开始或结束的行。但是例如,表_1的前4行不会在第5行的时间段内返回。
table_2:
符号 |
启动 |
端 |
AJR |
2021-01-02 |
2021-01-01-23 |
AJR |
2021-05 -08 |
2021-06-05 |
BBB |
2021-07-17 |
2021-07-24 |
CCC |
2021-10-23 |
2021-11-20 |
符号 |
开始 |
末端 |
: |
|
table_1 -09 |
AJR |
2021-01-02 |
2021-01-16 |
AJR |
2021-01-02 |
2021-01-01-16 |
AJR |
2021-01-09 |
2021-01-23 |
AJR |
2021-01-01-09 |
2021-09 |
AJR |
2021-09 2021-01-23 2021-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01-01 -16 |
2021-01-23 |
AJR |
2021-05-08 |
2021-05-15 |
AJR |
2021-05-08 |
2021-05-22 |
AJR |
2021-05-08 |
2021-05-08 |
AJR |
|
2021-05-29 05 |
AJR |
2021-05-15 |
2021-05-22 |
AJR |
2021-05-15 |
2021-05-29 |
AJR |
2021-05-15 |
2021-06-06-05 |
AJR |
2021-05-22 |
2021-22 |
2021-2021-29 |
22 |
2021-06-05 |
AJR |
2021-05-29 |
2021-06-05 |
BBB |
2021-07-17 |
2021-07-24 |
CCC |
2021-10-23 |
2021-10-0-30 |
CCC |
2021-10-23 |
2021-11-11-11-11-13 |
CCC |
2021-10-23 |
2021-11-20 |
CCC |
2021-10-30 |
2021-11-13 |
CCC |
2021-10-30 |
2021-11-11-20 |
CCC |
2021-11-11-13 |
2021-11-20 |
How would one query a table like table_1 (example shown below) such that only the maximum time period with no other rows being a subset of the time period. For example Table_1 should give whats in table_2. I have tried to use various where conditions using Exists but I can not seem to extract the desired rows.
I have tried queries like the following:
SELECT * FROM table_1 x WHERE NOT EXISTS (SELECT * FROM table_2 y WHERE (y.start < x.start and y.end > x.end) or (y.start < x.start and y.end <= x.end) or (y.start >= x.start and y.end < x.end) and y.symbol = x.symbol);
Any help would be appreciated. Using postgresql.
Edit:
I have reduced the size of table_1 to reduce the length. By maximum time period I mean all disjoint periods such that there is no period in between them in other rows. i.e. that the time period is maximal – that is for row R to be maximal, R.start<= R'.start<=R'.end<=R.end where R' is any other row in the table with the same symbol value.
So the reason why two rows are return for AJR is because there is no rows that start or end between 2021-01-23 and 2021-05-08. But for example the first 4 rows of table_1 are not returned as they fall within the time period of row 5.
Table_2:
symbol |
start |
end |
AJR |
2021-01-02 |
2021-01-23 |
AJR |
2021-05-08 |
2021-06-05 |
BBB |
2021-07-17 |
2021-07-24 |
CCC |
2021-10-23 |
2021-11-20 |
Table_1:
symbol |
start |
end |
AJR |
2021-01-02 |
2021-01-09 |
AJR |
2021-01-02 |
2021-01-16 |
AJR |
2021-01-02 |
2021-01-23 |
AJR |
2021-01-09 |
2021-01-16 |
AJR |
2021-01-09 |
2021-01-23 |
AJR |
2021-01-16 |
2021-01-23 |
AJR |
2021-05-08 |
2021-05-15 |
AJR |
2021-05-08 |
2021-05-22 |
AJR |
2021-05-08 |
2021-05-29 |
AJR |
2021-05-08 |
2021-06-05 |
AJR |
2021-05-15 |
2021-05-22 |
AJR |
2021-05-15 |
2021-05-29 |
AJR |
2021-05-15 |
2021-06-05 |
AJR |
2021-05-22 |
2021-05-29 |
AJR |
2021-05-22 |
2021-06-05 |
AJR |
2021-05-29 |
2021-06-05 |
BBB |
2021-07-17 |
2021-07-24 |
CCC |
2021-10-23 |
2021-10-30 |
CCC |
2021-10-23 |
2021-11-13 |
CCC |
2021-10-23 |
2021-11-20 |
CCC |
2021-10-30 |
2021-11-13 |
CCC |
2021-10-30 |
2021-11-20 |
CCC |
2021-11-13 |
2021-11-20 |
发布评论
评论(1)
您可以应用一个组合所有启动的自连接到符号字段上匹配的所有末端,然后从“ start ”值中删除。最终,您可以通过在“ symbol ”和“ start ”字段上分组在“ end ”字段上汇总。
检查演示在这里。
注释:如果只有一行(因此,自连接将在最后一个“ end end ”字段中返回零值),您可以使用一个
cocece
。You could apply a self join that combines all starts to all ends that match on the symbol field, then remove from the "start" values any corresponding "end" values. Eventually you can aggregate on the "end" field by grouping on the "symbol" and "start" field.
Check the demo here.
Note: In the case that there is only one row (hence the self join will return a NULL value in the last "end" field), you can fix this using a
COALESCE
.