How to pivot a table in this case (BigQuery) without manually listing all categories?

Posted on 2025-01-17 02:46:24

I have a table where the schema is roughly the following:

user_id | segment_id | day

The range of values for segment_id is not small: from 1 to 70.
For day, it's 0-2.

Ideally, I would like to transform this table to

user_id | segment_1_day_1_count | segment_2_day_1_count | ... segment_70_day_1_count | ... | segment_70_day_3_count | ... segment_1_count | segment_2_count | segment_3_count | day_1_count | day_2_count | day_3_count | total_count

Roughly speaking, for each user_id I want to get counts for all breakdown combinations:

  • by segment
  • by day
  • by segment and day
  • total

That looks very much like pivoting, but I am not sure there is a way to pivot based on several columns.

My current attempt is the following:

SELECT
  user_id,
  segment_id,
  day,
  COUNT(*) OVER (PARTITION BY user_id, segment_id) AS segment_count,
  COUNT(*) OVER (PARTITION BY user_id, day) AS day_count,
  COUNT(*) OVER (PARTITION BY user_id, segment_id, day) AS segment_day_count,
  COUNT(*) OVER (PARTITION BY user_id) AS total_count
FROM some_table

This is the data I need, but not in the desired format.

Comments (2)

池予 2025-01-24 02:46:25

Consider the approach below:

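-- Build the pivot statement as a string, then run it with EXECUTE IMMEDIATE.
-- Note: || needs STRING operands; if segment_id and day are INT64, wrap them in CAST(... AS STRING).
execute immediate (select '''select * from your_table
pivot (count(*) for 'segment_' || segment_id || '_day_' || day || '_count' in (''' ||
string_agg('"segment_' || segment_id || '_day_' || day || '_count"', ',' order by day, segment_id) || '))'
from (select distinct segment_id from your_table),
(select distinct day from your_table));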

If applied to dummy data similar to what you described, the output is:

[image: output table, not reproduced here]
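
To make the dynamic statement easier to follow, here is a sketch of a static query that should be roughly what it boils down to on a tiny dummy table where segment_id is 1 or 2 and day is 0 or 1. The value ranges, the pivot_key alias, the inner subquery, and the CAST(... AS STRING) calls are assumptions added for illustration (the casts matter only if the columns are INT64); they are not part of the answer above.

```sql
-- Sketch only: static counterpart of the generated pivot for
-- segment_id IN (1, 2) and day IN (0, 1).
-- pivot_key is a hypothetical helper column introduced for readability.
SELECT *
FROM (
  SELECT
    user_id,
    'segment_' || CAST(segment_id AS STRING)
      || '_day_' || CAST(day AS STRING) || '_count' AS pivot_key
  FROM your_table
)
PIVOT (
  COUNT(*) FOR pivot_key IN (
    -- Same order as the answer's string_agg: by day, then by segment_id.
    'segment_1_day_0_count', 'segment_2_day_0_count',
    'segment_1_day_1_count', 'segment_2_day_1_count'
  )
);
```

The dynamic version exists only to generate the IN (...) list from the distinct values in the table, so that the 70 x 3 combinations do not have to be typed by hand.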

岁月打碎记忆 2025-01-24 02:46:25

"Is it possible to also add counts for segment, day and total separately?"

Sure. It is just a [relatively] simple extension of my originally proposed solution:

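-- Four per-user result sets (segment x day, segment, day, total) joined on user_id.
-- list1, list2 and list3 are the generated IN (...) lists, built in the FROM clause at the bottom.
execute immediate (select '''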
  select * from (
    select * 
    from your_table 
    pivot (count(*) for 'segment_' || segment_id || '_day_' || day || '_count' in (''' || list1 || '''))
  ) 
  join (
    select * 
    from (select * except(day) from your_table) 
    pivot (count(*) for 'segment_' || segment_id || '_count' in (''' || list2 || '''))
  )
  using(user_id)
  join (
    select * 
    from (select * except(segment_id) from your_table) 
    pivot (count(*) for 'day_' || day || '_count' in (''' || list3 || '''))
  )
  using(user_id)
  join (
    select user_id, count(*) total 
    from your_table
    group by user_id
  )
  using(user_id)
'''
from (
  select string_agg('"segment_' || segment_id || '_day_' || day || '_count"', ',' order by day, segment_id) list1
  from (select distinct segment_id from your_table), (select distinct day from your_table)
),(
  select string_agg('"segment_' || segment_id || '_count"', ',' order by segment_id) list2
  from (select distinct segment_id from your_table)
),(
  select string_agg('"day_' || day || '_count"', ',' order by day) list3
  from (select distinct day from your_table)
)
);

If applied to dummy data similar to what you described, the output is:

[image: output table, not reproduced here]
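
As a cross-check of what the joined pivots compute, the same four breakdowns can be written by hand with conditional aggregation when the value sets are tiny. This is a minimal sketch, assuming INT64 columns with segment_id in (1, 2) and day in (0, 1); the column aliases follow the naming from the question and are not taken from the answer (which calls the last column total).

```sql
-- Sketch only: hand-written equivalent for segment_id IN (1, 2), day IN (0, 1).
SELECT
  user_id,
  -- by segment and day
  COUNTIF(segment_id = 1 AND day = 0) AS segment_1_day_0_count,
  COUNTIF(segment_id = 2 AND day = 0) AS segment_2_day_0_count,
  COUNTIF(segment_id = 1 AND day = 1) AS segment_1_day_1_count,
  COUNTIF(segment_id = 2 AND day = 1) AS segment_2_day_1_count,
  -- by segment
  COUNTIF(segment_id = 1) AS segment_1_count,
  COUNTIF(segment_id = 2) AS segment_2_count,
  -- by day
  COUNTIF(day = 0) AS day_0_count,
  COUNTIF(day = 1) AS day_1_count,
  -- total
  COUNT(*) AS total_count
FROM your_table
GROUP BY user_id;
```

With 70 segments and 3 days this form needs well over 200 hand-typed columns, which is the manual listing the question wants to avoid; it is mainly useful for validating the dynamic query on a small sample.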
