How can I pivot a table (BigQuery) without manually listing all the categories in this case?
I have a table where the schema is roughly the following:
user_id | segment_id | day
The range of values for segment_id is not small: from 1 to 70. For day, it's 0-2.
Ideally, I would like to transform this table to
user_id | segment_1_day_1_count | segment_2_day_1_count | ... | segment_70_day_1_count | ... | segment_70_day_3_count | segment_1_count | segment_2_count | ... | segment_70_count | day_1_count | day_2_count | day_3_count | total_count
Roughly speaking, for each user_id I want to get counts for all breakdown combinations:
- by segment
- by day
- by segment and day
- total
That looks very much like pivoting, but I am not sure there is a way to pivot based on several columns.
My current attempt is the following:
SELECT
  user_id,
  segment_id,
  day,
  -- count per (user, segment) across all days
  COUNT(*) OVER (PARTITION BY user_id, segment_id) AS segment_count,
  -- count per (user, day) across all segments
  COUNT(*) OVER (PARTITION BY user_id, day) AS day_count,
  -- count per (user, segment, day)
  COUNT(*) OVER (PARTITION BY user_id, segment_id, day) AS segment_day_count,
  -- total count per user
  COUNT(*) OVER (PARTITION BY user_id) AS total_count
FROM some_table
This is the data I need, but not in the desired format.
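For what it's worth, BigQuery's PIVOT operator only pivots on a single column, and the usual workaround is to concatenate the two dimensions into one pivot key first. A minimal sketch of that workaround (the _count aliases are assumed names; the IN list still has to be spelled out as literals):

SELECT *
FROM (
  SELECT
    user_id,
    -- collapse the two pivot dimensions into a single string key
    FORMAT('segment_%d_day_%d', segment_id, day) AS seg_day
  FROM some_table
)
PIVOT (
  COUNT(*)
  FOR seg_day IN ('segment_1_day_0' AS segment_1_day_0_count,
                  'segment_1_day_1' AS segment_1_day_1_count
                  /* ... all 70 x 3 combinations ... */)
)

That IN list is exactly the manual enumeration I'd like to avoid, which is what the answers below address.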
2 Answers
Consider the approach below; if applied to dummy data similar to what you described, the output is in the desired format.
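What follows is a sketch of one dynamic approach in that spirit rather than the answerer's verbatim code: BigQuery scripting builds the COUNTIF column list from the data itself, so none of the 70 x 3 categories are listed by hand (some_table is the name from the question; everything else is assumed):

DECLARE seg_day_cols STRING;

-- Build one COUNTIF expression per (segment_id, day) pair present in the data.
SET seg_day_cols = (
  SELECT STRING_AGG(
    FORMAT("COUNTIF(segment_id = %d AND day = %d) AS segment_%d_day_%d_count",
           segment_id, day, segment_id, day),
    ', ' ORDER BY segment_id, day)
  FROM (SELECT DISTINCT segment_id, day FROM some_table)
);

-- Run the generated pivot query: one wide row per user.
EXECUTE IMMEDIATE FORMAT("""
  SELECT user_id, %s, COUNT(*) AS total_count
  FROM some_table
  GROUP BY user_id
""", seg_day_cols);

COUNTIF(condition) here is the BigQuery shorthand for SUM(IF(condition, 1, 0)), so each generated column counts one (segment, day) combination per user.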
Sure, it's a [relatively] simple extension of my originally proposed solution; if applied to dummy data similar to what you described, the output is in the desired format.
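Likewise a hedged reconstruction rather than verbatim code: the same generated-SQL trick extends to the per-segment, per-day, and total breakdowns by building three column lists instead of one:

DECLARE seg_day_cols, seg_cols, day_cols STRING;

-- One COUNTIF per (segment_id, day) pair present in the data.
SET seg_day_cols = (
  SELECT STRING_AGG(
    FORMAT("COUNTIF(segment_id = %d AND day = %d) AS segment_%d_day_%d_count",
           segment_id, day, segment_id, day),
    ', ' ORDER BY segment_id, day)
  FROM (SELECT DISTINCT segment_id, day FROM some_table)
);

-- One COUNTIF per segment, across all days.
SET seg_cols = (
  SELECT STRING_AGG(
    FORMAT("COUNTIF(segment_id = %d) AS segment_%d_count", segment_id, segment_id),
    ', ' ORDER BY segment_id)
  FROM (SELECT DISTINCT segment_id FROM some_table)
);

-- One COUNTIF per day, across all segments.
SET day_cols = (
  SELECT STRING_AGG(
    FORMAT("COUNTIF(day = %d) AS day_%d_count", day, day),
    ', ' ORDER BY day)
  FROM (SELECT DISTINCT day FROM some_table)
);

-- All four breakdowns (segment x day, segment, day, total) in one wide row per user.
EXECUTE IMMEDIATE FORMAT("""
  SELECT user_id, %s, %s, %s, COUNT(*) AS total_count
  FROM some_table
  GROUP BY user_id
""", seg_day_cols, seg_cols, day_cols);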