Vertica SQL：基于两个条件，无视行

发布于 2025-01-24 13:37:10 字数 2579 浏览 0 评论 0原文

我需要一个考虑在整个表中关注整个表的条件：

如果为ID存在0（列中），则排除所有＆gt; 0的所有内容，如果0不存在，则存在d = a的行在此之前排除所有内容。

例如（情况1），我想无视行1＆amp; 2，在示例2（情况2）中，我想无视行1,2，＆amp; 3。

载体我有：其中d＆lt; = 0或d = a），但是在情况1中，这也返回了我不想要的行NR 2。

行Nr	Id	D	A
1	1	180	78
2	1	78	78
3	1	0	78
4	1	-67	78
5	1	-121	78

行Nr	ID	D	A
1	2	180	148
2	2	171	148
3	2	170	148
4 2 148 4	2	148 148 148 148	148
5	2	-67	148
6	2	-121	148

原文

I need a where condition that considers to following for an entire table:

If a 0 exists for an ID (in column d) then exclude everything that is >0, if 0 does not exist, but exists a row where d = a then exclude everything before that..

In Example (Case 1) I want to disregard rows 1 & 2, in Example 2 (Case 2) I want to disregard rows 1,2,& 3.

Currenty I have: where d <= 0 or d = a) but in Case 1 this also returns row nr 2, which I do not want.

row nr	ID	d	a
1	1	180	78
2	1	78	78
3	1	0	78
4	1	-67	78
5	1	-121	78

row nr	ID	d	a
1	2	180	148
2	2	171	148
3	2	170	148
4	2	148	148
5	2	-67	148
6	2	-121	148

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

与风相奔跑 2025-01-31 13:37:10

这比您预期的要复杂得多。您将必须涉及带有OLAP函数的嵌套查询，以检测一个分区中的每一行（由ID的值定义）属于该分区，其中至少一个行的值为0对于d，然后在嵌套查询之外，过滤该事实，d的值为0或更高。那是情况1。

在另一种情况下，您使用相同的嵌套查询来确定仅使用 no lod 的行，在分区中为d的值为0，从那里开始，最简单的方法是使用Vertica的match（）子句来滤除由以下方式组成的行模式：d等于>;零，一个或多个以下行的发生，我在查询中描述了图案：（d_equal_a anyrow*）。

这里是：

WITH
-- YOUR INPUT, don't use in query
indata(row_nr,ID,d,a) AS (
          SELECT 1,1,180,78
UNION ALL SELECT 2,1,78,78
UNION ALL SELECT 3,1,0,78
UNION ALL SELECT 4,1,-67,78
UNION ALL SELECT 5,1,-121,78
UNION ALL SELECT 1,2,180,148
UNION ALL SELECT 2,2,171,148
UNION ALL SELECT 3,2,170,148
UNION ALL SELECT 4,2,148,148
UNION ALL SELECT 5,2,-67,148
UNION ALL SELECT 6,2,-121,148
)
-- end of your input, real query starts here, replace following comma with "WITH"
,
min_abs_d_eq_0 AS (
-- nested query with OLAP expression returning Boolean
  SELECT
    *
    , (MIN(ABS(d)) OVER (PARTITION BY id) = 0) AS min_abs_d_eq_0
  FROM indata
)
,
case1 AS (
  SELECT
    row_nr
  , id
  , d
  , a
  , 'no match clause' AS event_name -- these are based on the 
  , 0 AS pattern_id                 -- MATCH clause coming from 
  , 0 AS match_id                   -- the next CTE, "case2"
  FROM min_abs_d_eq_0
  WHERE min_abs_d_eq_0 AND d <= 0
)
,
case2 AS (
  SELECT
    row_nr
  , id
  , d
  , a
  , event_name()
  , pattern_id()
  , match_id()
  FROM min_abs_d_eq_0
  WHERE NOT min_abs_d_eq_0
  MATCH (
    PARTITION BY id ORDER BY row_nr
    DEFINE
      d_equal_a  AS d = a
    , anyrow    AS true
    PATTERN p AS (d_equal_a anyrow*)
  )
)
SELECT * FROM case1
UNION ALL
SELECT * FROM case2
ORDER BY id,row_nr;
-- out  row_nr | id |  d   |  a  |   event_name    | pattern_id | match_id 
-- out --------+----+------+-----+-----------------+------------+----------
-- out       3 |  1 |    0 |  78 | no match clause |          0 |        0
-- out       4 |  1 |  -67 |  78 | no match clause |          0 |        0
-- out       5 |  1 | -121 |  78 | no match clause |          0 |        0
-- out       4 |  2 |  148 | 148 | d_equal_a       |          1 |        1
-- out       5 |  2 |  -67 | 148 | anyrow          |          1 |        2
-- out       6 |  2 | -121 | 148 | anyrow          |          1 |        3

This becomes a bit more complex to do than what you expected. You will have to involve a nested query with an OLAP function to detect that each row in a partition ( defined by the value of id) belongs to a partition of which at least one row has a value of 0 for d, and then, outside of that nested query, filter for that fact, and the value of d being 0 or greater. That's case 1.

In the other case, you use the same nested query to ascertain that you use only rows with no row with a value of 0 for d in the partition, and from there, the easiest way is to use Vertica's MATCH() clause to filter out the pattern of rows that consists of : a row with a d equal to a; zero, one or more occurrences of any row following, which I describe, in the query, with the pattern: (d_equal_a anyrow*) .

Here goes:

WITH
-- YOUR INPUT, don't use in query
indata(row_nr,ID,d,a) AS (
          SELECT 1,1,180,78
UNION ALL SELECT 2,1,78,78
UNION ALL SELECT 3,1,0,78
UNION ALL SELECT 4,1,-67,78
UNION ALL SELECT 5,1,-121,78
UNION ALL SELECT 1,2,180,148
UNION ALL SELECT 2,2,171,148
UNION ALL SELECT 3,2,170,148
UNION ALL SELECT 4,2,148,148
UNION ALL SELECT 5,2,-67,148
UNION ALL SELECT 6,2,-121,148
)
-- end of your input, real query starts here, replace following comma with "WITH"
,
min_abs_d_eq_0 AS (
-- nested query with OLAP expression returning Boolean
  SELECT
    *
    , (MIN(ABS(d)) OVER (PARTITION BY id) = 0) AS min_abs_d_eq_0
  FROM indata
)
,
case1 AS (
  SELECT
    row_nr
  , id
  , d
  , a
  , 'no match clause' AS event_name -- these are based on the 
  , 0 AS pattern_id                 -- MATCH clause coming from 
  , 0 AS match_id                   -- the next CTE, "case2"
  FROM min_abs_d_eq_0
  WHERE min_abs_d_eq_0 AND d <= 0
)
,
case2 AS (
  SELECT
    row_nr
  , id
  , d
  , a
  , event_name()
  , pattern_id()
  , match_id()
  FROM min_abs_d_eq_0
  WHERE NOT min_abs_d_eq_0
  MATCH (
    PARTITION BY id ORDER BY row_nr
    DEFINE
      d_equal_a  AS d = a
    , anyrow    AS true
    PATTERN p AS (d_equal_a anyrow*)
  )
)
SELECT * FROM case1
UNION ALL
SELECT * FROM case2
ORDER BY id,row_nr;
-- out  row_nr | id |  d   |  a  |   event_name    | pattern_id | match_id 
-- out --------+----+------+-----+-----------------+------------+----------
-- out       3 |  1 |    0 |  78 | no match clause |          0 |        0
-- out       4 |  1 |  -67 |  78 | no match clause |          0 |        0
-- out       5 |  1 | -121 |  78 | no match clause |          0 |        0
-- out       4 |  2 |  148 | 148 | d_equal_a       |          1 |        1
-- out       5 |  2 |  -67 | 148 | anyrow          |          1 |        2
-- out       6 |  2 | -121 | 148 | anyrow          |          1 |        3

回复收藏 0 原文

~没有更多了~