添加新列(维度)时,不同的计数总数不正确
我有一个查询,我需要计算一个字段来提出“交货停止”的值,如下所示:
SELECT
[EAR_START_DTTM_H],
cast(year([EAR_START_DTTM_H]) as varchar(4)) + right('0' + cast(month([EAR_START_DTTM_H]) as varchar(2)), 2) as calmonth,
count(distinct[RATING_GROUP]) as delivery_stops
FROM [raw_abdc_operations].[sapbw_emanifest_data]
where [EAR_START_DTTM_H] >= '2022-05-01' and [EAR_START_DTTM_H] < '2022-05-04' and org = '018'
group by [EAR_START_DTTM_H]
order by [EAR_START_DTTM_H]
结果1:
EAR_START_DTTM_H calmonth delivery_stops
2022-05-02T00:00:00 202205 1656
2022-05-03T00:00:00 202205 1688
**Total 3,344**
但是,当我添加另一个维度或另一个维度或列(即prod cat),总数现在是过度说明的,并且该字段被多次列出:
SELECT
[EAR_START_DTTM_H],
cast(year([EAR_START_DTTM_H]) as varchar(4)) + right('0' + cast(month([EAR_START_DTTM_H]) as varchar(2)), 2) as calmonth,
[PRODUCT_CAT],
count(distinct[RATING_GROUP]) as delivery_stops
FROM [raw_abdc_operations].[sapbw_emanifest_data]
where [EAR_START_DTTM_H] >= '2022-05-01' and [EAR_START_DTTM_H] < '2022-05-04' and org = '018'
group by [EAR_START_DTTM_H],[PRODUCT_CAT]
order by [EAR_START_DTTM_H],[PRODUCT_CAT]
结果2:
EAR_START_DTTM_H calmonth PRODUCT_CAT delivery_stops
2022-05-02T00:00:00 202205 COLTOTL 1082
2022-05-02T00:00:00 202205 COLTOTS 742
2022-05-02T00:00:00 202205 DRPPKG 1031
2022-05-02T00:00:00 202205 LTOTE 1346
2022-05-02T00:00:00 202205 NC_CS 71
2022-05-02T00:00:00 202205 STOTE 1618
2022-05-03T00:00:00 202205 COLTOTL 1072
2022-05-03T00:00:00 202205 COLTOTS 816
2022-05-03T00:00:00 202205 DRPPKG 998
2022-05-03T00:00:00 202205 LTOTE 1392
2022-05-03T00:00:00 202205 NC_CS 69
2022-05-03T00:00:00 202205 STOTE 1641
**Total 11,878**
为什么我的结果会被炸毁而似乎是双倍的 - 三重?我在这里想念什么?任何帮助将不胜感激!谢谢你!
I have a query where I need to count a field to come up with a value for 'delivery stops' as shown below:
SELECT
[EAR_START_DTTM_H],
cast(year([EAR_START_DTTM_H]) as varchar(4)) + right('0' + cast(month([EAR_START_DTTM_H]) as varchar(2)), 2) as calmonth,
count(distinct[RATING_GROUP]) as delivery_stops
FROM [raw_abdc_operations].[sapbw_emanifest_data]
where [EAR_START_DTTM_H] >= '2022-05-01' and [EAR_START_DTTM_H] < '2022-05-04' and org = '018'
group by [EAR_START_DTTM_H]
order by [EAR_START_DTTM_H]
RESULTS 1:
EAR_START_DTTM_H calmonth delivery_stops
2022-05-02T00:00:00 202205 1656
2022-05-03T00:00:00 202205 1688
**Total 3,344**
However, when I add another dimension or column (ie. Prod Cat), the total is now way over-stated and the field is listed multiple times:
SELECT
[EAR_START_DTTM_H],
cast(year([EAR_START_DTTM_H]) as varchar(4)) + right('0' + cast(month([EAR_START_DTTM_H]) as varchar(2)), 2) as calmonth,
[PRODUCT_CAT],
count(distinct[RATING_GROUP]) as delivery_stops
FROM [raw_abdc_operations].[sapbw_emanifest_data]
where [EAR_START_DTTM_H] >= '2022-05-01' and [EAR_START_DTTM_H] < '2022-05-04' and org = '018'
group by [EAR_START_DTTM_H],[PRODUCT_CAT]
order by [EAR_START_DTTM_H],[PRODUCT_CAT]
RESULTS 2:
EAR_START_DTTM_H calmonth PRODUCT_CAT delivery_stops
2022-05-02T00:00:00 202205 COLTOTL 1082
2022-05-02T00:00:00 202205 COLTOTS 742
2022-05-02T00:00:00 202205 DRPPKG 1031
2022-05-02T00:00:00 202205 LTOTE 1346
2022-05-02T00:00:00 202205 NC_CS 71
2022-05-02T00:00:00 202205 STOTE 1618
2022-05-03T00:00:00 202205 COLTOTL 1072
2022-05-03T00:00:00 202205 COLTOTS 816
2022-05-03T00:00:00 202205 DRPPKG 998
2022-05-03T00:00:00 202205 LTOTE 1392
2022-05-03T00:00:00 202205 NC_CS 69
2022-05-03T00:00:00 202205 STOTE 1641
**Total 11,878**
Why are my results being blown up and seemingly double-triple counted? What am I missing here? Any help would be much appreciated! Thank you!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
问题不是
COUNT(DISTILL)
问题是您的理解。您无法计算不同的值,并期望不同分组集的相同结果。过于简化的示例:
现在采取以下查询:
返回返回以下内容:
这是正确 列中有两个不同的值
value
;'ABC'
和'def'
。现在让我们进行以下查询:
返回以下结果:
这也是正确;对于
分组
的每个独特值,有2个value
的值('abc'
和def'
再次)。tl; dr:
sum
count> count(dimption)
的一个分组 not 等于sum
计数(不同)
用于另一个分组集。如果您想要一致的计数,请使用计数
。The problem isn't
COUNT (DISTINCT)
the problem is your understanding. You cannot count distinct values and expect the same results for different grouping sets.Take this overly simplified example:
Now take the following query:
This returns returns the following:
This is correct; there are two distinct values in the column
Value
;'abc'
and'def'
.Now lets do the following query:
This returns the following result:
This too is correct; for each distinct value of
Grouping
there are 2DISTINCT
values forValue
('abc'
and'def'
again).TL;DR: The
SUM
of aCOUNT(DISTINCT)
for one grouping does not equal theSUM
of aCOUNT(DISTINCT)
for another grouping set. If you want a consistent count, useCOUNT
.