SQL 将行转置为列

发布于 2024-08-18 07:03:25 字数 1141 浏览 7 评论 0原文

我有一个有趣的难题,我相信可以用纯 SQL 来解决。我有类似于以下的表格:

responses:

user_id | question_id | body
----------------------------
1       | 1           | Yes
2       | 1           | Yes
1       | 2           | Yes
2       | 2           | No
1       | 3           | No
2       | 3           | No


questions:

id | body
-------------------------
1 | Do you like apples?
2 | Do you like oranges?
3 | Do you like carrots?

我想得到以下输出

user_id | Do you like apples? | Do you like oranges? | Do you like carrots?
---------------------------------------------------------------------------
1       | Yes                 | Yes                  | No
2       | Yes                 | No                   | No

我不知道会有多少问题,而且它们将是动态的,所以我不能只为每个问题编写代码。我正在使用 PostgreSQL,我相信这称为转置,但我似乎找不到任何说明在 SQL 中执行此操作的标准方法的内容。我记得在大学的数据库课上做过这件事,但那是在 MySQL 中,老实说我不记得我们是如何做到的。

我假设它将是联接和 GROUP BY 语句的组合,但我什至不知道如何开始。

有人知道该怎么做吗?非常感谢!

编辑1:我找到了一些有关使用交叉表<的信息/a> 这似乎是我想要的,但我无法理解它。更好文章的链接将不胜感激!

I have an interesting conundrum which I believe can be solved in purely SQL. I have tables similar to the following:

responses:

user_id | question_id | body
----------------------------
1       | 1           | Yes
2       | 1           | Yes
1       | 2           | Yes
2       | 2           | No
1       | 3           | No
2       | 3           | No


questions:

id | body
-------------------------
1 | Do you like apples?
2 | Do you like oranges?
3 | Do you like carrots?

and I would like to get the following output

user_id | Do you like apples? | Do you like oranges? | Do you like carrots?
---------------------------------------------------------------------------
1       | Yes                 | Yes                  | No
2       | Yes                 | No                   | No

I don't know how many questions there will be, and they will be dynamic, so I can't just code for every question. I am using PostgreSQL and I believe this is called transposition, but I can't seem to find anything that says the standard way of doing this in SQL. I remember doing this in my database class back in college, but it was in MySQL and I honestly don't remember how we did it.

I'm assuming it will be a combination of joins and a GROUP BY statement, but I can't even figure out how to start.

Anybody know how to do this? Thanks very much!

Edit 1: I found some information about using a crosstab which seems to be what I want, but I'm having trouble making sense of it. Links to better articles would be greatly appreciated!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

风流物 2024-08-25 07:03:25

用途:

  SELECT r.user_id,
         MAX(CASE WHEN r.question_id = 1 THEN r.body ELSE NULL END) AS "Do you like apples?",
         MAX(CASE WHEN r.question_id = 2 THEN r.body ELSE NULL END) AS "Do you like oranges?",
         MAX(CASE WHEN r.question_id = 3 THEN r.body ELSE NULL END) AS "Do you like carrots?"
    FROM RESPONSES r
    JOIN QUESTIONS q ON q.id = r.question_id
GROUP BY r.user_id

这是标准的透视查询,因为您要将数据从行“透视”为列数据。

Use:

  SELECT r.user_id,
         MAX(CASE WHEN r.question_id = 1 THEN r.body ELSE NULL END) AS "Do you like apples?",
         MAX(CASE WHEN r.question_id = 2 THEN r.body ELSE NULL END) AS "Do you like oranges?",
         MAX(CASE WHEN r.question_id = 3 THEN r.body ELSE NULL END) AS "Do you like carrots?"
    FROM RESPONSES r
    JOIN QUESTIONS q ON q.id = r.question_id
GROUP BY r.user_id

This is a standard pivot query, because you are "pivoting" the data from rows to columnar data.

南汐寒笙箫 2024-08-25 07:03:25

我实现了一个真正的动态函数来处理这个问题,而无需对任何特定类别的答案进行硬编码或使用外部模块/扩展。它还可以完全控制列排序并支持多个键和类/属性列。

您可以在这里找到它: https://github.com/jumpstarter-io/colpivot

示例解决了这个特定的问题:

begin;

create temporary table responses (
    user_id integer,
    question_id integer,
    body text
) on commit drop;

create temporary table questions (
    id integer,
    body text
) on commit drop;

insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');

select colpivot('_output', $
    select r.user_id, q.body q, r.body a from responses r
        join questions q on q.id = r.question_id
$, array['user_id'], array['q'], '#.a', null);

select * from _output;

rollback;

输出:

 user_id | 'Do you like apples?' | 'Do you like carrots?' | 'Do you like oranges?' 
---------+-----------------------+------------------------+------------------------
       1 | Yes                   | No                     | Yes
       2 | Yes                   | No                     | No

I implemented a truly dynamic function to handle this problem without having to hard code any specific class of answers or use external modules/extensions. It also gives full control over column ordering and supports multiple key and class/attribute columns.

You can find it here: https://github.com/jumpstarter-io/colpivot

Example that solves this particular problem:

begin;

create temporary table responses (
    user_id integer,
    question_id integer,
    body text
) on commit drop;

create temporary table questions (
    id integer,
    body text
) on commit drop;

insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');

select colpivot('_output', $
    select r.user_id, q.body q, r.body a from responses r
        join questions q on q.id = r.question_id
$, array['user_id'], array['q'], '#.a', null);

select * from _output;

rollback;

This outputs:

 user_id | 'Do you like apples?' | 'Do you like carrots?' | 'Do you like oranges?' 
---------+-----------------------+------------------------+------------------------
       1 | Yes                   | No                     | Yes
       2 | Yes                   | No                     | No
撩发小公举 2024-08-25 07:03:25

您可以使用 crosstab 函数以这种方式解决此示例

drop table if exists responses;
create table responses (
user_id integer,
question_id integer,
body text
);

drop table if exists questions;
create table questions (
id integer,
body text
);

insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');

select * from crosstab('select responses.user_id, questions.body, responses.body from responses, questions where questions.id = responses.question_id order by user_id') as ct(userid integer, "Do you like apples?" text, "Do you like oranges?" text, "Do you like carrots?" text);

首先,您必须安装 tablefunc 扩展。从 9.1 版本开始,您可以使用创建扩展来完成此操作:

CREATE EXTENSION tablefunc;

You can solve this example with the crosstab function in this way

drop table if exists responses;
create table responses (
user_id integer,
question_id integer,
body text
);

drop table if exists questions;
create table questions (
id integer,
body text
);

insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');

select * from crosstab('select responses.user_id, questions.body, responses.body from responses, questions where questions.id = responses.question_id order by user_id') as ct(userid integer, "Do you like apples?" text, "Do you like oranges?" text, "Do you like carrots?" text);

First, you must install tablefunc extension. Since 9.1 version you can do it using create extension:

CREATE EXTENSION tablefunc;

我编写了一个函数来生成动态查询。
它为交叉表生成 sql 并创建一个视图(如果存在,则首先删除它)。
然后您可以从视图中进行选择以获得结果。

这是函数:

CREATE OR REPLACE FUNCTION public.c_crosstab (
  eavsql_inarg varchar,
  resview varchar,
  rowid varchar,
  colid varchar,
  val varchar,
  agr varchar
)
RETURNS void AS
$body$
DECLARE
    casesql varchar;
    dynsql varchar;    
    r record;
BEGIN   
 dynsql='';

 for r in 
      select * from pg_views where lower(viewname) = lower(resview)
  loop
      execute 'DROP VIEW ' || resview;
  end loop;   

 casesql='SELECT DISTINCT ' || colid || ' AS v from (' || eavsql_inarg || ') eav ORDER BY ' || colid;
 FOR r IN EXECUTE casesql Loop
    dynsql = dynsql || ', ' || agr || '(CASE WHEN ' || colid || '=''' || r.v || ''' THEN ' || val || ' ELSE NULL END) AS ' || agr || '_' || r.v;
 END LOOP;
 dynsql = 'CREATE VIEW ' || resview || ' AS SELECT ' || rowid || dynsql || ' from (' || eavsql_inarg || ') eav GROUP BY ' || rowid;
 RAISE NOTICE 'dynsql %1', dynsql; 
 EXECUTE dynsql;
END

$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

这是我如何使用它:

SELECT c_crosstab('query_txt', 'view_name', 'entity_column_name', 'attribute_column_name', 'value_column_name', 'first');

示例:
你奔跑的拳头:

SELECT c_crosstab('Select * from table', 'ct_view', 'usr_id', 'question_id', 'response_value', 'first');

比:

Select * from ct_view;

I wrote a function to generate the dynamic query.
It generates the sql for the crosstab and creates a view (drops it first if it exists).
You can than select from the view to get your results.

Here is the function:

CREATE OR REPLACE FUNCTION public.c_crosstab (
  eavsql_inarg varchar,
  resview varchar,
  rowid varchar,
  colid varchar,
  val varchar,
  agr varchar
)
RETURNS void AS
$body$
DECLARE
    casesql varchar;
    dynsql varchar;    
    r record;
BEGIN   
 dynsql='';

 for r in 
      select * from pg_views where lower(viewname) = lower(resview)
  loop
      execute 'DROP VIEW ' || resview;
  end loop;   

 casesql='SELECT DISTINCT ' || colid || ' AS v from (' || eavsql_inarg || ') eav ORDER BY ' || colid;
 FOR r IN EXECUTE casesql Loop
    dynsql = dynsql || ', ' || agr || '(CASE WHEN ' || colid || '=''' || r.v || ''' THEN ' || val || ' ELSE NULL END) AS ' || agr || '_' || r.v;
 END LOOP;
 dynsql = 'CREATE VIEW ' || resview || ' AS SELECT ' || rowid || dynsql || ' from (' || eavsql_inarg || ') eav GROUP BY ' || rowid;
 RAISE NOTICE 'dynsql %1', dynsql; 
 EXECUTE dynsql;
END

$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

And here is how I use it:

SELECT c_crosstab('query_txt', 'view_name', 'entity_column_name', 'attribute_column_name', 'value_column_name', 'first');

Example:
Fist you run:

SELECT c_crosstab('Select * from table', 'ct_view', 'usr_id', 'question_id', 'response_value', 'first');

Than:

Select * from ct_view;
疧_╮線 2024-08-25 07:03:25

contrib/tablefunc/ 中有一个这样的示例。

There is an example of this in contrib/tablefunc/.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文