How can I get a dense table of values with category columns using PostgreSQL (crosstab)?

Posted 2025-01-30 07:59:35 | Word count: 4935 | Views: 2 | Comments: 0

I have this toy example, which gives me a sparse table of values separated into their different categories. I would like a dense matrix instead, where each column is ordered individually.

drop table if exists temp_table;
create temp table temp_table(
    rowid int
    , category text
    , score int
    );
insert into temp_table values (0, 'cat1', 10);
insert into temp_table values (1, 'cat2', 21);
insert into temp_table values (2, 'cat3', 32);
insert into temp_table values (3, 'cat2', 23);
insert into temp_table values (4, 'cat2', 24);
insert into temp_table values (5, 'cat3', 35);
insert into temp_table values (6, 'cat1', 16);
insert into temp_table values (7, 'cat1', 17);
insert into temp_table values (8, 'cat2', 28);
insert into temp_table values (9, 'cat2', 29);

Which gives this temporary table:

rowid category score
0 cat1 10
1 cat2 21
2 cat3 32
3 cat2 23
4 cat2 24
5 cat3 35
6 cat1 16
7 cat1 17
8 cat2 28
9 cat2 29

Then I pivot the score values into separate columns based on their category:

select "cat1", "cat2", "cat3"
from crosstab(
    $$ select rowid, category, score from temp_table $$ -- as source_sql
    , $$ select distinct category from temp_table order by category $$ -- as category_sql
 ) as (rowid int, "cat1" int, "cat2" int, "cat3" int)
 

That outputs:

cat1 | cat2 | cat3
10   |      |
     | 21   |
     |      | 32
     | 23   |
     | 24   |
     |      | 35
16   |      |
17   |      |
     | 28   |
     | 29   |

But I would want the result of the query to be dense, like:

cat1 | cat2 | cat3
10   | 21   | 32
16   | 23   | 35
17   | 24   |
     | 28   |
     | 29   |

Maybe PostgreSQL's crosstab is not even the right tool for this, but it came to mind first because it produces a sparse table that is close to the result I need.

Comments (2)

感性 2025-02-06 07:59:35

This should work for the exact given example data and expected output.

select max(cat1) as cat1, max(cat2) as cat2, max(cat3) as cat3
from crosstab(
$$ select rank() over(partition by category order by rowid) as ranking,
  rowid,
  category,
  score
from temp_table
order by rowid, category asc $$ -- as source_sql
, $$ select distinct category
from temp_table
order by category $$ -- as category_sql
  ) as (ranking int, rowid int, "cat1" int, "cat2" int, "cat3" int)
group by ranking
order by ranking asc;

You can test the solution here - https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f198e40a18a282cc0d65fa6ecdf797cb

Edit:
Improvements made to your query to arrive at the solution:

  1. In the source SQL query, I rank the rows within each category by rowid, which determines the order of the values in each output column, as per your requirement:

select rank() over(partition by category order by rowid) as ranking, rowid, category, score from temp_table order by rowid, category asc

  2. In the outer query, I pick the max() value of each category column for each ranking obtained in the source SQL query. Each (ranking, category) pair holds at most one score, so max() simply collapses the sparse rows of one ranking into a single dense row.
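To make the ranking step easier to follow, the inner query can be run on its own. The following is only a sketch: it inlines the question's sample rows as a CTE named temp_table (an assumption made so the snippet runs without the temporary table), and it sorts by ranking purely for readability.

```sql
-- Sample rows from the question, inlined so the snippet is self-contained.
with temp_table(rowid, category, score) as (
  values (0, 'cat1', 10), (1, 'cat2', 21), (2, 'cat3', 32),
         (3, 'cat2', 23), (4, 'cat2', 24), (5, 'cat3', 35),
         (6, 'cat1', 16), (7, 'cat1', 17), (8, 'cat2', 28),
         (9, 'cat2', 29)
)
-- Number the rows of each category in rowid order.
select rank() over (partition by category order by rowid) as ranking,
       rowid, category, score
from temp_table
order by ranking, category;

-- ranking | rowid | category | score
--       1 |     0 | cat1     |    10
--       1 |     1 | cat2     |    21
--       1 |     2 | cat3     |    32
--       2 |     6 | cat1     |    16
--       2 |     3 | cat2     |    23
--       2 |     5 | cat3     |    35
--       3 |     7 | cat1     |    17
--       3 |     4 | cat2     |    24
--       4 |     8 | cat2     |    28
--       5 |     9 | cat2     |    29
```

Rows that share a ranking are exactly the rows that end up side by side in one line of the dense result. Because rowid is unique, rank() produces no ties here and behaves like row_number().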

独自←快乐 2025-02-06 07:59:35
with cte as (
  select category, score, row_number() over (
    partition by category order by score
  ) as r
  from temp_table
)
  select
    sum(score) filter (where category = 'cat1') as cat1,
    sum(score) filter (where category = 'cat2') as cat2,
    sum(score) filter (where category = 'cat3') as cat3
  from cte
  group by r
  order by r
;

If the number of columns is known and it is reasonably small, FILTER might be a better option than CROSSTAB, which requires an extension.
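One detail worth checking: the query above numbers the rows by score, which matches the expected output only because the sample scores happen to increase with rowid inside each category. If the original row order is what matters, a variant along these lines may fit better. It is a sketch under two assumptions: rowid reflects the desired order, and the sample rows are inlined as a CTE named temp_table so the snippet runs on its own. max() is used instead of sum() as an illustrative choice; since each (category, r) pair contains exactly one row, either aggregate returns the same value.

```sql
-- Sample rows from the question, inlined so the snippet is self-contained.
with temp_table(rowid, category, score) as (
  values (0, 'cat1', 10), (1, 'cat2', 21), (2, 'cat3', 32),
         (3, 'cat2', 23), (4, 'cat2', 24), (5, 'cat3', 35),
         (6, 'cat1', 16), (7, 'cat1', 17), (8, 'cat2', 28),
         (9, 'cat2', 29)
),
numbered as (
  -- Position of each row inside its category, in rowid order.
  select category, score,
         row_number() over (partition by category order by rowid) as r
  from temp_table
)
select max(score) filter (where category = 'cat1') as cat1,
       max(score) filter (where category = 'cat2') as cat2,
       max(score) filter (where category = 'cat3') as cat3
from numbered
group by r
order by r;

-- cat1 | cat2 | cat3
--   10 |   21 |   32
--   16 |   23 |   35
--   17 |   24 |
--      |   28 |
--      |   29 |
```

For comparison, crosstab is provided by the tablefunc extension, so the crosstab-based queries need something like create extension if not exists tablefunc; to have been run in the database first.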
