PostgreSQL 交叉表查询

发布于 2024-09-05 01:08:20 字数 383 浏览 6 评论 0原文

如何在 PostgreSQL 中创建交叉表查询?例如,我有下表:

Section    Status    Count
A          Active    1
A          Inactive  2
B          Active    4
B          Inactive  5

我希望查询返回以下交叉表:

Section    Active    Inactive
A          1         2
B          4         5

How do I create crosstab queries in PostgreSQL? For example I have the following table:

Section    Status    Count
A          Active    1
A          Inactive  2
B          Active    4
B          Inactive  5

I would like the query to return the following crosstab:

Section    Active    Inactive
A          1         2
B          4         5

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

流星番茄 2024-09-12 01:08:21

Crosstab 功能可在 tablefunc 扩展下使用。您必须为数据库创建一次此扩展。

创建扩展tablefunc

您可以使用以下代码使用交叉表创建数据透视表:

create table test_Crosstab( section text,
status text,
count numeric)

insert into test_Crosstab values ( 'A','Active',1)
                ,( 'A','Inactive',2)
                ,( 'B','Active',4)
                ,( 'B','Inactive',5)

select * from crosstab(
    'select section
    ,status
    ,count
    from test_crosstab'
    )as ctab ("Section" text,"Active" numeric,"Inactive" numeric)

Crosstab function is available under the tablefunc extension. You'll have to create this extension one time for the database.

CREATE EXTENSION tablefunc;

You can use the below code to create pivot table using cross tab:

create table test_Crosstab( section text,
status text,
count numeric)

insert into test_Crosstab values ( 'A','Active',1)
                ,( 'A','Inactive',2)
                ,( 'B','Active',4)
                ,( 'B','Inactive',5)

select * from crosstab(
    'select section
    ,status
    ,count
    from test_crosstab'
    )as ctab ("Section" text,"Active" numeric,"Inactive" numeric)
萌辣 2024-09-12 01:08:20

安装附加模块tablefunc 每个数据库一次,提供函数crosstab()。从 Postgres 9.1 开始,您可以使用 CREATE EXTENSION 为此:

CREATE EXTENSION IF NOT EXISTS tablefunc;

改进的测试用例

CREATE TABLE tbl (
   section   text
 , status    text
 , ct        integer  -- "count" is a reserved word in standard SQL
);

INSERT INTO tbl VALUES 
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
                    , ('C', 'Inactive', 7);  -- ('C', 'Active') is missing

简单形式 - 不适合缺少属性

crosstab(text)1 输入参数:

SELECT *
FROM   crosstab(
   'SELECT section, status, ct
    FROM   tbl
    ORDER  BY 1,2'  -- needs to be "ORDER BY 1,2" here
   ) AS ct ("Section" text, "Active" int, "Inactive" int);

返回:

 Section | Active | Inactive
---------+--------+----------
 A       |      1 |        2
 B       |      4 |        5
 C       |      7 |           -- !!
  • 不需要用于铸造和重命名。
  • 请注意 C不正确结果:第一列填充了值 7。有时,这种行为是可取的,但不适合此用例。
  • 简单形式还限制为所提供的输入查询中的恰好三列:row_name类别。没有空间容纳额外的列,就像下面的 2 参数替代方案一样。

安全表单

带有 2 输入参数的

SELECT *
FROM   crosstab(
   'SELECT section, status, ct
    FROM   tbl
    ORDER  BY 1,2'  -- could also just be "ORDER BY 1" here

  , $VALUES ('Active'::text), ('Inactive')$
   ) AS ct ("Section" text, "Active" int, "Inactive" int);

crosstab(text, text):返回:

 Section | Active | Inactive
---------+--------+----------
 A       |      1 |        2
 B       |      4 |        5
 C       |        |        7  -- !!
  • 注意 C.


  • 第二个参数可以是任何查询,每个属性返回一个,与末尾的列定义的顺序相匹配。通常您会想要从基础表中查询不同的属性,如下所示:

     '从 tbl ORDER BY 1 中选择不同的属性'
    

手册中有说明。

由于无论如何您都必须拼写出列定义列表中的所有列(预定义的 crosstabN() 变体除外),因此提供简短的说明通常会更有效在 VALUES 表达式中列出,如演示:

    $VALUES ('Active'::text), ('Inactive')$)

或者(不在手册中):

    $SELECT unnest('{Active,Inactive}'::text[])$  -- short syntax for long lists

db<>fiddle 此处

过多输入行的影响

多余的输入行的处理方式不同 - 相同(“row_name”,“category”)组合的重复行 - 上例中的 (section, status)

1 个参数表单从左到右填写可用值列。多余的值将被丢弃。
较早的输入行获胜。

2 参数形式将每个输入值分配给其专用列,覆盖任何以前的分配。
稍后输入的行获胜。

通常,您一开始就没有重复项。但如果您这样做,请根据您的要求仔细调整排序顺序 - 并记录发生的情况。
或者如果您不在乎的话,可以快速获得任意结果。只要注意一下效果就可以了。

高级示例

\crosstabview 在 psql

Postgres 9.6 中将此元命令添加到其默认交互式终端 psql。您可以运行用作第一个 crosstab() 参数的查询,并将其提供给 \crosstabview(立即或在下一步中)。 Like:

db=> SELECT section, status, ct FROM tbl \crosstabview

与上面类似的结果,但它是客户端独有的表示功能。输入行的处理方式略有不同,因此不需要 ORDER BY\crosstabview 在手册中。该页面底部有更多代码示例。

Daniel Vérité(psql 功能的作者)在 dba.SE 上的相关答案:

Install the additional module tablefunc once per database, which provides the function crosstab(). Since Postgres 9.1 you can use CREATE EXTENSION for that:

CREATE EXTENSION IF NOT EXISTS tablefunc;

Improved test case

CREATE TABLE tbl (
   section   text
 , status    text
 , ct        integer  -- "count" is a reserved word in standard SQL
);

INSERT INTO tbl VALUES 
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
                    , ('C', 'Inactive', 7);  -- ('C', 'Active') is missing

Simple form - not fit for missing attributes

crosstab(text) with 1 input parameter:

SELECT *
FROM   crosstab(
   'SELECT section, status, ct
    FROM   tbl
    ORDER  BY 1,2'  -- needs to be "ORDER BY 1,2" here
   ) AS ct ("Section" text, "Active" int, "Inactive" int);

Returns:

 Section | Active | Inactive
---------+--------+----------
 A       |      1 |        2
 B       |      4 |        5
 C       |      7 |           -- !!
  • No need for casting and renaming.
  • Note the incorrect result for C: the value 7 is filled in for the first column. Sometimes, this behavior is desirable, but not for this use case.
  • The simple form is also limited to exactly three columns in the provided input query: row_name, category, value. There is no room for extra columns like in the 2-parameter alternative below.

Safe form

crosstab(text, text) with 2 input parameters:

SELECT *
FROM   crosstab(
   'SELECT section, status, ct
    FROM   tbl
    ORDER  BY 1,2'  -- could also just be "ORDER BY 1" here

  , $VALUES ('Active'::text), ('Inactive')$
   ) AS ct ("Section" text, "Active" int, "Inactive" int);

Returns:

 Section | Active | Inactive
---------+--------+----------
 A       |      1 |        2
 B       |      4 |        5
 C       |        |        7  -- !!
  • Note the correct result for C.

  • The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end. Often you will want to query distinct attributes from the underlying table like this:

      'SELECT DISTINCT attribute FROM tbl ORDER BY 1'
    

That's in the manual.

Since you have to spell out all columns in a column definition list anyway (except for pre-defined crosstabN() variants), it is typically more efficient to provide a short list in a VALUES expression like demonstrated:

    $VALUES ('Active'::text), ('Inactive')$)

Or (not in the manual):

    $SELECT unnest('{Active,Inactive}'::text[])$  -- short syntax for long lists
  • I used dollar quoting to make quoting easier.

  • You can even output columns with different data types with crosstab(text, text) - as long as the text representation of the value column is valid input for the target type. This way you might have attributes of different kind and output text, date, numeric etc. for respective attributes. There is a code example at the end of the chapter crosstab(text, text) in the manual.

db<>fiddle here

Effect of excess input rows

Excess input rows are handled differently - duplicate rows for the same ("row_name", "category") combination - (section, status) in the above example.

The 1-parameter form fills in available value columns from left to right. Excess values are discarded.
Earlier input rows win.

The 2-parameter form assigns each input value to its dedicated column, overwriting any previous assignment.
Later input rows win.

Typically, you don't have duplicates to begin with. But if you do, carefully adjust the sort order to your requirements - and document what's happening.
Or get fast arbitrary results if you don't care. Just be aware of the effect.

Advanced examples

\crosstabview in psql

Postgres 9.6 added this meta-command to its default interactive terminal psql. You can run the query you would use as first crosstab() parameter and feed it to \crosstabview (immediately or in the next step). Like:

db=> SELECT section, status, ct FROM tbl \crosstabview

Similar result as above, but it's a representation feature on the client side exclusively. Input rows are treated slightly differently, hence ORDER BY is not required. Details for \crosstabview in the manual. There are more code examples at the bottom of that page.

Related answer on dba.SE by Daniel Vérité (the author of the psql feature):

生活了然无味 2024-09-12 01:08:20
SELECT section,
       SUM(CASE status WHEN 'Active'   THEN count ELSE 0 END) AS active,  --here you pivot each status value as a separate column explicitly
       SUM(CASE status WHEN 'Inactive' THEN count ELSE 0 END) AS inactive --here you pivot each status value as a separate column explicitly

FROM t
GROUP BY section
SELECT section,
       SUM(CASE status WHEN 'Active'   THEN count ELSE 0 END) AS active,  --here you pivot each status value as a separate column explicitly
       SUM(CASE status WHEN 'Inactive' THEN count ELSE 0 END) AS inactive --here you pivot each status value as a separate column explicitly

FROM t
GROUP BY section
电影里的梦 2024-09-12 01:08:20

您可以使用 附加的 crosstab() 函数模块 tablefunc - 您必须为每个数据库安装一次。从 PostgreSQL 9.1 开始,您可以使用 CREATE EXTENSION< /a> 为此:

CREATE EXTENSION tablefunc;

在你的情况下,我相信它看起来像这样:

CREATE TABLE t (Section CHAR(1), Status VARCHAR(10), Count integer);

INSERT INTO t VALUES ('A', 'Active',   1);
INSERT INTO t VALUES ('A', 'Inactive', 2);
INSERT INTO t VALUES ('B', 'Active',   4);
INSERT INTO t VALUES ('B', 'Inactive', 5);

SELECT row_name AS Section,
       category_1::integer AS Active,
       category_2::integer AS Inactive
FROM crosstab('select section::text, status, count::text from t',2)
            AS ct (row_name text, category_1 text, category_2 text);

DB Fiddle here:

  • 一切正常: https: //dbfiddle.uk/iKCW9Uhh
  • 如果没有 CREATE EXTENSION tablefunc;,您会收到此错误: https://dbfiddle.uk/j8W1CMvI
    错误:函数交叉表(未知,整数)不存在
    第 4 行:FROM crosstab('选择部分::文本,状态,计数::文本来自...
                 ^
    提示:没有函数与给定的名称和参数类型匹配。您可能需要添加显式类型转换。
    

You can use the crosstab() function of the additional module tablefunc - which you have to install once per database. Since PostgreSQL 9.1 you can use CREATE EXTENSION for that:

CREATE EXTENSION tablefunc;

In your case, I believe it would look something like this:

CREATE TABLE t (Section CHAR(1), Status VARCHAR(10), Count integer);

INSERT INTO t VALUES ('A', 'Active',   1);
INSERT INTO t VALUES ('A', 'Inactive', 2);
INSERT INTO t VALUES ('B', 'Active',   4);
INSERT INTO t VALUES ('B', 'Inactive', 5);

SELECT row_name AS Section,
       category_1::integer AS Active,
       category_2::integer AS Inactive
FROM crosstab('select section::text, status, count::text from t',2)
            AS ct (row_name text, category_1 text, category_2 text);

DB Fiddle here:

  • Everything works: https://dbfiddle.uk/iKCW9Uhh
  • Without CREATE EXTENSION tablefunc; you get this error: https://dbfiddle.uk/j8W1CMvI
    ERROR:  function crosstab(unknown, integer) does not exist
    LINE 4: FROM crosstab('select section::text, status, count::text fro...
                 ^
    HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
    
好听的两个字的网名 2024-09-12 01:08:20

JSON聚合的解决方案:

CREATE TEMP TABLE t (
  section   text
, status    text
, ct        integer  -- don't use "count" as column name.
);

INSERT INTO t VALUES 
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
                   , ('C', 'Inactive', 7); 


SELECT section,
       (obj ->> 'Active')::int AS active,
       (obj ->> 'Inactive')::int AS inactive
FROM (SELECT section, json_object_agg(status,ct) AS obj
      FROM t
      GROUP BY section
     )X

Solution with JSON aggregation:

CREATE TEMP TABLE t (
  section   text
, status    text
, ct        integer  -- don't use "count" as column name.
);

INSERT INTO t VALUES 
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
                   , ('C', 'Inactive', 7); 


SELECT section,
       (obj ->> 'Active')::int AS active,
       (obj ->> 'Inactive')::int AS inactive
FROM (SELECT section, json_object_agg(status,ct) AS obj
      FROM t
      GROUP BY section
     )X
旧街凉风 2024-09-12 01:08:20

抱歉,这并不完整,因为我无法在这里测试它,但它可能会让您朝着正确的方向前进。我正在翻译我使用的类似查询的内容:

select mt.section, mt1.count as Active, mt2.count as Inactive
from mytable mt
left join (select section, count from mytable where status='Active')mt1
on mt.section = mt1.section
left join (select section, count from mytable where status='Inactive')mt2
on mt.section = mt2.section
group by mt.section,
         mt1.count,
         mt2.count
order by mt.section asc;

我正在使用的代码是:

select m.typeID, m1.highBid, m2.lowAsk, m1.highBid - m2.lowAsk as diff, 100*(m1.highBid - m2.lowAsk)/m2.lowAsk as diffPercent
from mktTrades m
   left join (select typeID,MAX(price) as highBid from mktTrades where bid=1 group by typeID)m1
   on m.typeID = m1.typeID
   left join (select typeID,MIN(price) as lowAsk  from mktTrades where bid=0 group by typeID)m2
   on m1.typeID = m2.typeID
group by m.typeID, 
         m1.highBid, 
         m2.lowAsk
order by diffPercent desc;

它将返回 typeID、最高出价和最低要价以及两者之间的差异(正差异意味着某物的购买价格可能低于其出售价格)。

Sorry this isn't complete because I can't test it here, but it may get you off in the right direction. I'm translating from something I use that makes a similar query:

select mt.section, mt1.count as Active, mt2.count as Inactive
from mytable mt
left join (select section, count from mytable where status='Active')mt1
on mt.section = mt1.section
left join (select section, count from mytable where status='Inactive')mt2
on mt.section = mt2.section
group by mt.section,
         mt1.count,
         mt2.count
order by mt.section asc;

The code I'm working from is:

select m.typeID, m1.highBid, m2.lowAsk, m1.highBid - m2.lowAsk as diff, 100*(m1.highBid - m2.lowAsk)/m2.lowAsk as diffPercent
from mktTrades m
   left join (select typeID,MAX(price) as highBid from mktTrades where bid=1 group by typeID)m1
   on m.typeID = m1.typeID
   left join (select typeID,MIN(price) as lowAsk  from mktTrades where bid=0 group by typeID)m2
   on m1.typeID = m2.typeID
group by m.typeID, 
         m1.highBid, 
         m2.lowAsk
order by diffPercent desc;

which will return a typeID, the highest price bid and the lowest price asked and the difference between the two (a positive difference would mean something could be bought for less than it can be sold).

一生独一 2024-09-12 01:08:20

我设计了一种不同的动态方法,它采用动态记录。类型(临时表,通过匿名过程构建)& JSON。这对于无法安装 tablefunc/crosstab 扩展但仍可以创建临时表或运行的最终用户可能很有用。过程的。

该示例假设所有 xtab 列都是相同类型 (INTEGER),但列数是数据驱动的 &可变参数。也就是说,JSON 聚合函数确实允许混合数据类型,因此通过使用嵌入式复合(混合)类型有创新的潜力。

如果您想静态定义rec,则其真正的内容可以减少到一步。 JSON 记录集函数内的类型(通过发出复合类型的嵌套 SELECT)。

dbfiddle.uk

https://dbfiddle.uk/N1EzugHk

There's a different dynamic method that I've devised, one that employs a dynamic rec. type (a temp table, built via an anonymous procedure) & JSON. This may be useful for an end-user who can't install the tablefunc/crosstab extension, but can still create temp tables or run anon. proc's.

The example assumes all the xtab columns are the same type (INTEGER), but the # of columns is data-driven & variadic. That said, JSON aggregate functions do allow for mixed data types, so there's potential for innovation via the use of embedded composite (mixed) types.

The real meat of it can be reduced down to one step if you want to statically define the rec. type inside the JSON recordset function (via nested SELECTs that emit a composite type).

dbfiddle.uk

https://dbfiddle.uk/N1EzugHk

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文