GROUP BY 或 COUNT 类似字段值 - UNPIVOT？

发布于 2024-12-17 05:29:38 字数 674 浏览 3 评论 0原文

我有一个包含测试字段的表，示例

id         | test1    | test2    | test3    | test4    | test5
+----------+----------+----------+----------+----------+----------+
12345      | P        | P        | F        | I        | P

因此对于每条记录我想知道有多少通过，失败或不完整（P，F或I）

有没有办法按值进行分组？

Pseudo：

SELECT ('P' IN (fields)) AS pass
WHERE id = 12345

我有大约 40 个测试字段，我需要以某种方式将它们组合在一起，但我真的不想编写这个超级难看的长查询。是的，我知道我应该将表重写为两个或三个单独的表，但这是另一个问题。

预期结果：

passed     | failed   | incomplete
+----------+----------+----------+
3          | 1        | 1

建议？

注意：我正在运行 PostgreSQL 7.4，是的，我们正在升级

原文

I have a table with test fields, Example

id         | test1    | test2    | test3    | test4    | test5
+----------+----------+----------+----------+----------+----------+
12345      | P        | P        | F        | I        | P

So for each record I want to know how many Pass, Failed or Incomplete (P,F or I)

Is there a way to GROUP BY value?

Pseudo:

SELECT ('P' IN (fields)) AS pass
WHERE id = 12345

I have about 40 test fields that I need to somehow group together and I really don't want to write this super ugly, long query. Yes I know I should rewrite the table into two or three separate tables but this is another problem.

Expected Results:

passed     | failed   | incomplete
+----------+----------+----------+
3          | 1        | 1

Suggestions?

Note: I'm running PostgreSQL 7.4 and yes we are upgrading

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

浮华 2024-12-24 05:29:38

我可能想出了一个解决方案：

SELECT id
      ,l - length(replace(t, 'P', '')) AS nr_p
      ,l - length(replace(t, 'F', '')) AS nr_f
      ,l - length(replace(t, 'I', '')) AS nr_i
FROM   (SELECT id, test::text AS t, length(test::text) AS l  FROM test) t

该技巧的工作原理如下：

将行类型转换为其文本表示形式。
测量字符长度。
替换要计数的字符并测量长度的变化。
计算子选择中原始行的长度以供重复使用。

这要求 P、F、I 不存在于行中的其他位置。使用子选择排除任何其他可能干扰的列。

在 8.4 - 9.1 中进行测试。现在没有人再使用 PostgreSQL 7.4，你必须自己测试一下。我只使用基本功能，但我不确定在 7.4 中将行类型转换为文本是否可行。如果这不起作用，您必须手动连接所有测试列一次：

SELECT id
      ,length(t) - length(replace(t, 'P', '')) AS nr_p
      ,length(t) - length(replace(t, 'F', '')) AS nr_f
      ,length(t) - length(replace(t, 'I', '')) AS nr_i
FROM   (SELECT id, test1||test2||test3||test4 AS t FROM test) t

这要求所有列都为NOT NULL。

I may have come up with a solution:

SELECT id
      ,l - length(replace(t, 'P', '')) AS nr_p
      ,l - length(replace(t, 'F', '')) AS nr_f
      ,l - length(replace(t, 'I', '')) AS nr_i
FROM   (SELECT id, test::text AS t, length(test::text) AS l  FROM test) t

The trick works like this:

Transform the rowtype into its text representation.
Measure character-length.
Replace the character you want to count and measure the change in length.
Compute the length of the original row in the subselect for repeated use.

This requires that P, F, I are present nowhere else in the row. Use a sub-select to exclude any other columns that might interfere.

Tested in 8.4 - 9.1. Nobody uses PostgreSQL 7.4 anymore nowadays, you'll have to test yourself. I only use basic functions, but I am not sure if casting the rowtype to text is feasible in 7.4. If that doesn't work, you'll have to concatenate all test-columns once by hand:

SELECT id
      ,length(t) - length(replace(t, 'P', '')) AS nr_p
      ,length(t) - length(replace(t, 'F', '')) AS nr_f
      ,length(t) - length(replace(t, 'I', '')) AS nr_i
FROM   (SELECT id, test1||test2||test3||test4 AS t FROM test) t

This requires all columns to be NOT NULL.

回复收藏 0 原文

心如荒岛 2024-12-24 05:29:38

本质上，您需要通过测试对数据进行逆透视：

id         | test     | result   
+----------+----------+----------+
12345      | test1    | P        
12345      | test2    | P        
12345      | test3    | F        
12345      | test4    | I        
12345      | test5    | P

...

- 以便您可以根据测试结果对其进行分组。

不幸的是，PostgreSQL没有内置的pivot/unpivot功能，所以最简单的方法是这样的：

select id, 'test1' test, test1 result from mytable union all
select id, 'test2' test, test2 result from mytable union all
select id, 'test3' test, test3 result from mytable union all
select id, 'test4' test, test4 result from mytable union all
select id, 'test5' test, test5 result from mytable union all

...

还有其他方法可以解决这个问题，但是对于40列数据，这将变得非常难看。

编辑：另一种方法-

select r.result, sum(char_length(replace(replace(test1||test2||test3||test4||test5,excl1,''),excl2,'')))
from   mytable m, 
       (select 'P' result, 'F' excl1, 'I' excl2 union all
        select 'F' result, 'P' excl1, 'I' excl2 union all
        select 'I' result, 'F' excl1, 'P' excl2) r
group by r.result

Essentially, you need to unpivot your data by test:

id         | test     | result   
+----------+----------+----------+
12345      | test1    | P        
12345      | test2    | P        
12345      | test3    | F        
12345      | test4    | I        
12345      | test5    | P

...

- so that you can then group it by test result.

Unfortunately, PostgreSQL doesn't have pivot/unpivot functionality built in, so the simplest way to do this would be something like:

select id, 'test1' test, test1 result from mytable union all
select id, 'test2' test, test2 result from mytable union all
select id, 'test3' test, test3 result from mytable union all
select id, 'test4' test, test4 result from mytable union all
select id, 'test5' test, test5 result from mytable union all

...

There are other ways of approaching this, but with 40 columns of data this is going to get really ugly.

EDIT: an alternative approach -

select r.result, sum(char_length(replace(replace(test1||test2||test3||test4||test5,excl1,''),excl2,'')))
from   mytable m, 
       (select 'P' result, 'F' excl1, 'I' excl2 union all
        select 'F' result, 'P' excl1, 'I' excl2 union all
        select 'I' result, 'F' excl1, 'P' excl2) r
group by r.result

回复收藏 0 原文

老街孤人 2024-12-24 05:29:38

您可以使用辅助即时表将列转换为行，然后您就可以应用聚合函数，如下所示：

SELECT
  SUM(fields = 'P') AS passed,
  SUM(fields = 'F') AS failed,
  SUM(fields = 'I') AS incomplete
FROM (
  SELECT
    t.id,
    CASE x.idx
      WHEN 1 THEN t.test1
      WHEN 2 THEN t.test2
      WHEN 3 THEN t.test3
      WHEN 4 THEN t.test4
      WHEN 5 THEN t.test5
    END AS fields
  FROM atable t
    CROSS JOIN (
      SELECT 1 AS idx
      UNION ALL SELECT 2
      UNION ALL SELECT 3
      UNION ALL SELECT 4
      UNION ALL SELECT 5
    ) x
  WHERE t.id = 12345
) s

You could use an auxiliary on-the-fly table to turn columns into rows, then you would be able to apply aggregate functions, something like this:

SELECT
  SUM(fields = 'P') AS passed,
  SUM(fields = 'F') AS failed,
  SUM(fields = 'I') AS incomplete
FROM (
  SELECT
    t.id,
    CASE x.idx
      WHEN 1 THEN t.test1
      WHEN 2 THEN t.test2
      WHEN 3 THEN t.test3
      WHEN 4 THEN t.test4
      WHEN 5 THEN t.test5
    END AS fields
  FROM atable t
    CROSS JOIN (
      SELECT 1 AS idx
      UNION ALL SELECT 2
      UNION ALL SELECT 3
      UNION ALL SELECT 4
      UNION ALL SELECT 5
    ) x
  WHERE t.id = 12345
) s

回复收藏 0 原文

不必在意 2024-12-24 05:29:38

编辑：刚刚看到关于 7.4 的评论，我认为这不适用于那个古老的版本（unnest() 后来才出现）。如果有人认为这篇文章不值得保留，我会删除它。

采用 Erwin 的想法，使用“行表示”作为解决方案的基础，并自动动态“规范化”表格：

select id,
       sum(case when flag = 'F' then 1 else null end) as failed,
       sum(case when flag = 'P' then 1 else null end) as passed,
       sum(case when flag = 'I' then 1 else null end) as incomplete
from (
  select id, 
         unnest(string_to_array(trim(trailing ')' from substr(all_column_values,strpos(all_column_values, ',') + 1)), ',')) flag
  from (
    SELECT id,
           not_normalized::text AS all_column_values
    FROM not_normalized
  ) t1
) t2
group by id

该解决方案的核心是 Erwin 的技巧，即从完整的行中生成单个值使用强制转换 not_normalized::text。字符串函数用于去除前导 id 值及其周围的括号。

其结果将转换为数组，并使用 unnest() 函数将该数组转换为结果集。

要理解该部分，只需逐步运行内部选择即可。

然后对结果进行分组并统计相应的值。

Edit: just saw the comment about 7.4, I don't think this will work with that ancient version (unnest() came a lot later). If anyone thinks this is not worth keeping, I'll delete it.

Taking Erwin's idea to use the "row representation" as a base for the solution a bit further and automatically "normalize" the table on-the-fly:

select id,
       sum(case when flag = 'F' then 1 else null end) as failed,
       sum(case when flag = 'P' then 1 else null end) as passed,
       sum(case when flag = 'I' then 1 else null end) as incomplete
from (
  select id, 
         unnest(string_to_array(trim(trailing ')' from substr(all_column_values,strpos(all_column_values, ',') + 1)), ',')) flag
  from (
    SELECT id,
           not_normalized::text AS all_column_values
    FROM not_normalized
  ) t1
) t2
group by id

The heart of the solution is Erwin's trick to make a single value out of the complete row using the cast not_normalized::text. The string functions are applied to strip of the leading id value and the brackets around it.

The result of that is transformed into an array and that array is transformed into a result set using the unnest() function.

To understand that part, simply run the inner selects step by step.

Then the result is grouped and the corresponding values are counted.

回复收藏 0 原文

~没有更多了~