PL/SQL 中的表转换/字段解析

发布于 2024-08-31 05:33:28 字数 386 浏览 1 评论 0原文

我有非规范化的表，类似于

CODES

ID  | VALUE
10  | A,B,C
11  | A,B
12  | A,B,C,D,E,F
13  | R,T,D,W,W,W,W,W,S,S

The job is to conversion ，其中 VALUE 中的每个标记将生成新行。示例：

CODES_TRANS

ID  | VALUE_TRANS
10  | A
10  | B
10  | C
11  | A
11  | B

在不使用自定义 pl/sql 包（最好使用纯 SQL）的情况下，在 PL/SQL 中执行此操作的最佳方法是什么？

显而易见的解决方案是通过游标来实现它。有什么想法吗？

原文

I have de-normalized table, something like

CODES

ID  | VALUE
10  | A,B,C
11  | A,B
12  | A,B,C,D,E,F
13  | R,T,D,W,W,W,W,W,S,S

The job is to convert is where each token from VALUE will generate new row. Example:

CODES_TRANS

ID  | VALUE_TRANS
10  | A
10  | B
10  | C
11  | A
11  | B

What is the best way to do it in PL/SQL without usage of custom pl/sql packages, ideally with pure SQL?

Obvious solution is to implement it via cursors. Any ideas?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

奈何桥上唱咆哮 2024-09-07 05:33:28

另一种替代方法是使用 model 子句：

SQL> select id
  2       , value
  3    from codes
  4   model
  5         return updated rows
  6         partition by (id)
  7         dimension by (-1 i)
  8         measures (value)
  9         ( value[for i from 0 to length(value[-1])-length(replace(value[-1],',')) increment 1]
 10           = regexp_substr(value[-1],'[^,]+',1,cv(i)+1)
 11         )
 12   order by id
 13       , i
 14  /

        ID VALUE
---------- -------------------
        10 A
        10 B
        10 C
        11 A
        11 B
        12 A
        12 B
        12 C
        12 D
        12 E
        12 F
        13 R
        13 T
        13 D
        13 W
        13 W
        13 W
        13 W
        13 W
        13 S
        13 S

21 rows selected.

我在此博文中为此类查询编写了最多 6 个替代方法： http://rwijk.blogspot.com/2007/11/interval-based-row- Generation.html

问候，
抢。

Another alternative is to use the model clause:

SQL> select id
  2       , value
  3    from codes
  4   model
  5         return updated rows
  6         partition by (id)
  7         dimension by (-1 i)
  8         measures (value)
  9         ( value[for i from 0 to length(value[-1])-length(replace(value[-1],',')) increment 1]
 10           = regexp_substr(value[-1],'[^,]+',1,cv(i)+1)
 11         )
 12   order by id
 13       , i
 14  /

        ID VALUE
---------- -------------------
        10 A
        10 B
        10 C
        11 A
        11 B
        12 A
        12 B
        12 C
        12 D
        12 E
        12 F
        13 R
        13 T
        13 D
        13 W
        13 W
        13 W
        13 W
        13 W
        13 S
        13 S

21 rows selected.

I have written up to 6 alternatives for this type of query in this blogpost: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html

Regards,
Rob.

回复收藏 0 原文

剑心龙吟 2024-09-07 05:33:28

我为您提供了一个纯 SQL 解决方案。

我改编了我在上发现的技巧一个旧的 Ask Tom 网站，由 Mihail Bratu 发布。我的改编使用正则表达式来标记 VALUE 列，因此它需要 10g 或更高。

测试数据。

SQL> select * from t34
  2  /

        ID VALUE
---------- -------------------------
        10 A,B,C
        11 A,B
        12 A,B,C,D,E,F
        13 R,T,D,W1,W2,W3,W4,W5,S,S

SQL>

查询：

SQL> select   t34.id
  2            , t.column_value value
  3  from t34
  4       , table(cast(multiset(
  5              select regexp_substr (t34.value, '[^(,)]+', 1, level)
  6              from dual
  7              connect by level <= length(value)
  8         ) as sys.dbms_debug_vc2coll )) t
  9  where t.column_value != ','
 10  /

        ID VALUE
---------- -------------------------
        10 A
        10 B
        10 C
        11 A
        11 B
        12 A
        12 B
        12 C
        12 D
        12 E
        12 F
        13 R
        13 T
        13 D
        13 W1
        13 W2
        13 W3
        13 W4
        13 W5
        13 S
        13 S

21 rows selected.

SQL>

I have a pure SQL solution for you.

I adapted a trick I found on an old Ask Tom site, posted by Mihail Bratu. My adaptation uses regex to tokenise the VALUE column, so it requires 10g or higher.

The test data.

SQL> select * from t34
  2  /

        ID VALUE
---------- -------------------------
        10 A,B,C
        11 A,B
        12 A,B,C,D,E,F
        13 R,T,D,W1,W2,W3,W4,W5,S,S

SQL>

The query:

SQL> select   t34.id
  2            , t.column_value value
  3  from t34
  4       , table(cast(multiset(
  5              select regexp_substr (t34.value, '[^(,)]+', 1, level)
  6              from dual
  7              connect by level <= length(value)
  8         ) as sys.dbms_debug_vc2coll )) t
  9  where t.column_value != ','
 10  /

        ID VALUE
---------- -------------------------
        10 A
        10 B
        10 C
        11 A
        11 B
        12 A
        12 B
        12 C
        12 D
        12 E
        12 F
        13 R
        13 T
        13 D
        13 W1
        13 W2
        13 W3
        13 W4
        13 W5
        13 S
        13 S

21 rows selected.

SQL>

回复收藏 0 原文

无戏配角 2024-09-07 05:33:28

根据 Celko 的书，这是我发现的并且效果很好！

  SELECT 
    TABLE1.ID
    , MAX(SEQ1.SEQ) AS START_POS
    , SEQ2.SEQ AS END_POS
    , COUNT(SEQ2.SEQ) AS PLACE
  FROM 
    TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
  WHERE 
    SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
    AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
    AND SEQ1.SEQ < SEQ2.SEQ
    AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
  GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ

其中 V_SEQ 是一个只有一个字段的静态表：

SEQ, integer values 1 through N, where N >= MAX_LENGTH(VALUE).

这是基于 VALUE 两端都用“,”包裹的事实，如下所示：

,A,B,C,D,

如果您的标记是固定长度的（就像我的情况一样），我只需使用 PLACE 字段来计算实际的字符串。如果长度可变，则使用 start_pos 和 end_pos

所以，在我的例子中，标记是 2 个字符长，所以最终的 SQL 是：

SELECT 
    TABLE1.ID
    , SUBSTR(TABLE1.VALUE, T_SUB.PLACE * 3 - 2 , 2 ) AS SINGLE_VAL
FROM
(
  SELECT 
    TABLE1.ID
    , MAX(SEQ1.SEQ) AS START_POS
    , SEQ2.SEQ AS END_POS
    , COUNT(SEQ2.SEQ) AS PLACE
  FROM 
    TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
  WHERE 
    SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
    AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
    AND SEQ1.SEQ < SEQ2.SEQ
    AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
  GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ
) T_SUB
INNER JOIN 
  TABLE1 ON TABLE1.ID = T_SUB.ID
ORDER BY TABLE1.ID, T_SUB.PLACE

Based on Celko's book, here is what I found and it's working well!

  SELECT 
    TABLE1.ID
    , MAX(SEQ1.SEQ) AS START_POS
    , SEQ2.SEQ AS END_POS
    , COUNT(SEQ2.SEQ) AS PLACE
  FROM 
    TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
  WHERE 
    SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
    AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
    AND SEQ1.SEQ < SEQ2.SEQ
    AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
  GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ

Where V_SEQ is a static table with one field:

SEQ, integer values 1 through N, where N >= MAX_LENGTH(VALUE).

This is based on the fact the the VALUE is wrapped by ',' on both ends, like this:

,A,B,C,D,

If your tokens are fixed length (like in my case) I simply used PLACE field to calculate the actual string. If variable length, use start_pos and end_pos

So, in my case, tokens are 2 char long, so the final SQL is:

SELECT 
    TABLE1.ID
    , SUBSTR(TABLE1.VALUE, T_SUB.PLACE * 3 - 2 , 2 ) AS SINGLE_VAL
FROM
(
  SELECT 
    TABLE1.ID
    , MAX(SEQ1.SEQ) AS START_POS
    , SEQ2.SEQ AS END_POS
    , COUNT(SEQ2.SEQ) AS PLACE
  FROM 
    TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
  WHERE 
    SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
    AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
    AND SEQ1.SEQ < SEQ2.SEQ
    AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
  GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ
) T_SUB
INNER JOIN 
  TABLE1 ON TABLE1.ID = T_SUB.ID
ORDER BY TABLE1.ID, T_SUB.PLACE

回复收藏 0 原文