How to include 0-count results with GROUP BY in HiveQL

Posted on 2025-01-24 04:10:22

I'm new to Hive. I want groups with a count of 0 to appear in my results.

I have a table like this:

- - - - - - - - - - - - - - -
data_source_id,  part_hour, ...
- - - - - - - - - - - - - - -

This is my query:

SELECT data_source_id, COUNT(*) as count 
FROM data_source 
WHERE part_hour = 2022041618 
GROUP BY data_source_id;

An example result is:

data_source_id, count
12            , 35
13            , 36

But my desired result is:

data_source_id, count
12            , 35
13            , 36
15            , 0

How can I get the rows with a count of 0 in the results?

Comments (1)

氛圍 2025-01-31 04:10:22

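You can do this, but you need to remove the WHERE clause and move its condition into a conditional sum, so that every data_source_id still forms a group.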

SELECT data_source_id, 
SUM(case when part_hour = 2022041618 then 1 else 0 end ) as count 
FROM data_source  
GROUP BY data_source_id;
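
To see why dropping the WHERE clause matters, here is a minimal sketch that runs both versions of the query against a few made-up rows. It uses Python's built-in sqlite3 module as a local stand-in for Hive, which is an assumption for quick testing only; the sample rows are invented, and the alias is renamed to cnt (my choice) so it does not share a name with the COUNT function.

```python
import sqlite3

# In-memory SQLite database standing in for Hive; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_source (data_source_id INTEGER, part_hour INTEGER)")
conn.executemany(
    "INSERT INTO data_source VALUES (?, ?)",
    [(12, 2022041618), (12, 2022041618), (13, 2022041618), (15, 2022041617)],
)

# WHERE filters rows before grouping, so id 15 never forms a group.
with_where = conn.execute(
    """
    SELECT data_source_id, COUNT(*) AS cnt
    FROM data_source
    WHERE part_hour = 2022041618
    GROUP BY data_source_id
    ORDER BY data_source_id
    """
).fetchall()

# Conditional sum: every id forms a group; only matching rows add 1.
conditional = conn.execute(
    """
    SELECT data_source_id,
           SUM(CASE WHEN part_hour = 2022041618 THEN 1 ELSE 0 END) AS cnt
    FROM data_source
    GROUP BY data_source_id
    ORDER BY data_source_id
    """
).fetchall()

print(with_where)   # [(12, 2), (13, 1)]
print(conditional)  # [(12, 2), (13, 1), (15, 0)]
```

The trade-off is that without the WHERE clause the query reads every part_hour, not just the one of interest.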

You can also do this using a self join.

EDIT - I changed the SQL below as per your need. If the table is partitioned on part_hour and data_source_id, the query should be quick.

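-- d lists every data_source_id; rs holds the counts for the requested hour.
-- The left join keeps ids missing from rs, and nvl turns their NULL count into 0.
SELECT d.data_source_id, nvl(rs.count, 0) AS count
FROM
(SELECT DISTINCT data_source_id FROM data_source) d
LEFT JOIN (SELECT data_source_id, COUNT(*) AS count
FROM data_source
WHERE part_hour = 2022041618
GROUP BY data_source_id) rs ON rs.data_source_id = d.data_source_id;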
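
The self-join approach can be checked the same way. This is a rough sketch under the same assumptions: SQLite via Python's sqlite3 module stands in for Hive, the rows are invented, and the alias cnt is my own. COALESCE replaces nvl here so the query runs on SQLite; Hive accepts COALESCE as well.

```python
import sqlite3

# In-memory SQLite database standing in for Hive; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_source (data_source_id INTEGER, part_hour INTEGER)")
conn.executemany(
    "INSERT INTO data_source VALUES (?, ?)",
    [(12, 2022041618), (12, 2022041618), (13, 2022041618), (15, 2022041617)],
)

# d: every known id. rs: counts for the requested hour only.
# LEFT JOIN keeps ids with no match in rs; COALESCE maps their NULL to 0.
rows = conn.execute(
    """
    SELECT d.data_source_id, COALESCE(rs.cnt, 0) AS cnt
    FROM (SELECT DISTINCT data_source_id FROM data_source) d
    LEFT JOIN (
        SELECT data_source_id, COUNT(*) AS cnt
        FROM data_source
        WHERE part_hour = 2022041618
        GROUP BY data_source_id
    ) rs ON rs.data_source_id = d.data_source_id
    ORDER BY d.data_source_id
    """
).fetchall()

print(rows)  # [(12, 2), (13, 1), (15, 0)]
```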