How to decide a column value based on the next row

Posted on 2025-01-16 14:43:07

I am using Teradata 16.20.53.13 and have a table (minimized version):

CREATE TABLE TABLE_1
     (
      std_nm VARCHAR(50),      
      std_age INTEGER
     );

With data (minimum subset):

insert into TABLE_1 values ('abc', 31);
insert into TABLE_1 values ('abc', 36);
insert into TABLE_1 values ('abc', 35);
insert into TABLE_1 values ('xyz', 17);
insert into TABLE_1 values ('xyz', 14);

I want the count, sum, and average of std_age to appear only on the last row of each distinct std_nm value.
The approach below gives me the right aggregates, but repeats them on every row:

    select t1.std_nm, 
           t1.std_age, 
           t2.name_count, 
           t2.sum_age, 
           t2.avg_age
      from TABLE_1 as t1
inner join (
              select std_nm, 
                     count(std_nm) as name_count, 
                     sum(std_age) as sum_age, 
                     avg(std_age) as avg_age
                from TABLE_1
            group by std_nm
           ) t2
        on t1.std_nm = t2.std_nm
  order by t1.std_nm;

With the above SQL, every row carries the aggregates of its std_nm: for the sample data, each abc row shows name_count 3, sum_age 102, and avg_age 34, and each xyz row shows 2, 31, and 15.5.

Question: How can I get zero/null/blank in name_count, sum_age and avg_age on all rows of each std_nm except the last one, so that the aggregates appear only once per std_nm?

I am trying to use CASE when selecting name_count, sum_age and avg_age, but I think there might be a better or cleaner way, perhaps a smarter join or something else. I am open to all options that work on Teradata 16.

咽泪装欢 2025-01-23 14:43:07

SELECT D.std_nm,
       D.std_age,
       D.XCOL,
       -- show each aggregate only on the row ranked first within its std_nm
       CASE WHEN D.XCOL = 1 THEN SUB_Q.avg_age    ELSE NULL END AS AVG_AGE,
       CASE WHEN D.XCOL = 1 THEN SUB_Q.name_count ELSE NULL END AS NAME_COUNT,
       CASE WHEN D.XCOL = 1 THEN SUB_Q.sum_age    ELSE NULL END AS SUM_AGE
FROM
(
   -- rank the rows of each std_nm, highest std_age first
   SELECT T.std_nm,
          T.std_age,
          ROW_NUMBER() OVER (PARTITION BY T.std_nm ORDER BY T.std_age DESC) AS XCOL
   FROM TABLE_1 AS T
) D
JOIN
(
   -- one row of aggregates per std_nm
   SELECT std_nm,
          COUNT(std_nm) AS name_count,
          SUM(std_age)  AS sum_age,
          AVG(std_age)  AS avg_age
   FROM TABLE_1
   GROUP BY std_nm
) SUB_Q
ON D.std_nm = SUB_Q.std_nm;

Could you please try the above and see whether it suits you?
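
The question also asks whether a smarter join could replace the CASE expressions. A minimal sketch, assuming the same TABLE_1 and the same ranking as in the query above (the highest std_age in each std_nm gets row number 1): the rank test can move into a LEFT JOIN condition, so rows that are not flagged find no match and receive NULL. The aliases d, s, and rn are illustrative names, not part of the original answer.

```sql
-- Sketch only: aggregates attach to one row per std_nm via the join condition.
SELECT d.std_nm,
       d.std_age,
       s.name_count,
       s.sum_age,
       s.avg_age
FROM (
       -- rank rows inside each std_nm; rn = 1 marks the row that keeps the aggregates
       SELECT t.std_nm,
              t.std_age,
              ROW_NUMBER() OVER (PARTITION BY t.std_nm ORDER BY t.std_age DESC) AS rn
       FROM TABLE_1 AS t
     ) AS d
LEFT JOIN (
       -- one row of aggregates per std_nm
       SELECT std_nm,
              COUNT(std_nm) AS name_count,
              SUM(std_age)  AS sum_age,
              AVG(std_age)  AS avg_age
       FROM TABLE_1
       GROUP BY std_nm
     ) AS s
  ON d.std_nm = s.std_nm
 AND d.rn = 1              -- only the flagged row matches; all others get NULLs
ORDER BY d.std_nm, d.rn DESC;   -- flagged row is listed last within each std_nm
```

Which row counts as "last" depends entirely on the ORDER BY inside ROW_NUMBER; std_age DESC is carried over from the answer above and may need to be swapped for whatever column defines the intended order.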

乱世争霸 2025-01-23 14:43:07

@Sergey's answer can be simplified by using Group Aggregates (window aggregates with only PARTITION BY) to avoid the join. This is a lot of cut and paste, but should result in a single step in Explain:

SELECT std_nm,std_age
  ,CASE
     WHEN Row_Number() Over (PARTITION BY std_nm ORDER BY std_age DESC)=1
     THEN Count(std_nm) Over (PARTITION BY std_nm )   
   END AS NAME_COUNT
  ,CASE
     WHEN Row_Number() Over (PARTITION BY std_nm ORDER BY std_age DESC)=1 
     THEN Sum(std_age) Over (PARTITION BY std_nm )
   END AS SUM_AGE
  ,CASE 
     WHEN Row_Number() Over (PARTITION BY std_nm ORDER BY std_age DESC)=1 
     THEN Avg(std_age) Over (PARTITION BY std_nm )
   END AS AVG_AGE
FROM TABLE_1 
;
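
The repeated ROW_NUMBER expression can be written once if the window functions are computed in a derived table first. This is a hedged sketch rather than a tested replacement: the aliases rn and dt are made up for illustration, and whether Explain still shows a single step with the extra derived table has not been verified here.

```sql
-- Sketch only: compute the rank and the group aggregates once, then mask them.
SELECT std_nm,
       std_age,
       CASE WHEN rn = 1 THEN name_count END AS name_count,
       CASE WHEN rn = 1 THEN sum_age    END AS sum_age,
       CASE WHEN rn = 1 THEN avg_age    END AS avg_age
FROM (
       SELECT std_nm,
              std_age,
              -- rn = 1 marks the row with the highest std_age in each std_nm
              ROW_NUMBER()  OVER (PARTITION BY std_nm ORDER BY std_age DESC) AS rn,
              -- group aggregates: same value on every row of the partition
              COUNT(std_nm) OVER (PARTITION BY std_nm) AS name_count,
              SUM(std_age)  OVER (PARTITION BY std_nm) AS sum_age,
              AVG(std_age)  OVER (PARTITION BY std_nm) AS avg_age
       FROM TABLE_1
     ) AS dt
ORDER BY std_nm, rn DESC;   -- flagged row is listed last within each std_nm
```

A CASE without ELSE yields NULL on the other rows; if zeros are preferred, adding ELSE 0 to each CASE would be one way to get them.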
