将嵌套阵列转换为雪花

发布于 2025-02-08 15:39:57 字数 524 浏览 2 评论 0原文

我有一个表列，上面有雪花数据库中的嵌套数组。我想以雪花SQL中所示的方式将嵌套数组转换为列。

表名称：Sensor_data

Rx列是数据类型变体的。如下所示，嵌套阵列不会总是为3。在某些情况下，有20,000个嵌套阵列，而其他情况则没有。

| ID |             RX              |
|----|-----------------------------|
| 1  |[[0, 15], [1, 50], [2, 34.2]]|
| 2  |[[0, 20], [1, 25]]           |
| 3  |[[0, 100], [1, 42], [2, 76]] |

我想从上表中实现这样的东西：

| ID |Col0 | Col1| Col2|
|----|-----|-----|-----|
| 1  |  15 |  50 | 34.2|
| 2  |  20 |  25 | NULL|
| 3  | 100 |  42 | 76  |

原文

I have a table column with nested arrays in a Snowflake database. I want to convert the nested array into columns in the manner shown below in Snowflake SQL.

Table Name: SENSOR_DATA

The RX column is of data type VARIANT. The nested arrays will not always be 3 as shown below. There are cases where there are 20,000 nested arrays, and other cases where there are none.

| ID |             RX              |
|----|-----------------------------|
| 1  |[[0, 15], [1, 50], [2, 34.2]]|
| 2  |[[0, 20], [1, 25]]           |
| 3  |[[0, 100], [1, 42], [2, 76]] |

I want to achieve something like this from the table above:

| ID |Col0 | Col1| Col2|
|----|-----|-----|-----|
| 1  |  15 |  50 | 34.2|
| 2  |  20 |  25 | NULL|
| 3  | 100 |  42 | 76  |

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

荒人说梦 2025-02-15 15:39:57

让我们重新创建该表：

create table sensor_data as 
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]');

然后获取您拥有的不同列键：

select distinct r.value[0] from sensor_data, table(flatten(input=>rx)) r;

r.value [0]
0
1
2

给定我们希望这是动态的，下一步只是使用静态SQL检查它，它给了我们正确的答案：

 select id
    ,max(iff(r.value[0] = 0, r.value[1], null)) as col_0
    ,max(iff(r.value[0] = 1, r.value[1], null)) as col_1
    ,max(iff(r.value[0] = 2, r.value[1], null)) as col_2
 from sensor_data, table(flatten(input=>rx)) r
 group by 1
 order by 1;

ID	col_0	col_1	col_2
1	15	50	34.2
2	20	25	null
3	100	42	76

立即写入动态执行此操作：

declare
  sql string;
  c1 cursor for select distinct r.value[0] as key from sensor_data, table(flatten(input=>rx)) r;
begin
  sql := 'select id ';
  for record in c1 do
    sql := sql || ',max(iff(r.value[0] = '|| record.key::text ||', r.value[1], null)) as col_' || record.key::text;
  end for;
  sql := sql || ' from sensor_data, table(flatten(input=>rx)) r  group by 1  order by 1';
  return sql;
end;

这给了我们：

选择ID，max（iff（r.value [0] = 0，r.value 1 ，null））为col_0，max（iff（r.value [0] = 1，r.value 1 ，null））为col_1，max（iff（iff（r.value [0] = 2，r.value，r.value） 1 ，null）表（FLATTEN（输入=＆GT; rx））R组按1顺序为1

，当运行给我们预期的结果时。

因此，现在我们想运行：

declare
  sql string;
  res resultset;
  c1 cursor for select distinct r.value[0] as key from sensor_data, table(flatten(input=>rx)) r;
begin
  sql := 'select id ';
  for record in c1 do
    sql := sql || ',max(iff(r.value[0] = '|| record.key::text ||', r.value[1], null)) as col_' || record.key::text;
  end for;
  sql := sql || ' from sensor_data, table(flatten(input=>rx)) r  group by 1  order by 1';
  
  res := (execute immediate :sql);
  return table (res);
end;

给予：

ID	COL_0	COL_1	COL_2
1	15	50	34.2
2	20	25	NULL
3	100	42	76

基于手册的这些部分的代码：

与loops一起工作

雪花扫描/结果表.html“ rel =“ nofollow noreferrer”>与结果集一起使用

和上述顺序：

create or replace table sensor_data as 
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]')
             ,(4, '[[0,20],[30,50], [45, 100]]');

declare
  sql string;
  res resultset;
  c1 cursor for select distinct r.value[0] as key from sensor_data, table(flatten(input=>rx)) r order by key;
begin
  sql := 'select id ';
  for record in c1 do
    sql := sql || ',max(iff(r.value[0] = '|| record.key::text ||', r.value[1], null)) as col_' || record.key::text;
  end for;
  sql := sql || ' from sensor_data, table(flatten(input=>rx)) r  group by 1  order by 1';
  
  res := (execute immediate :sql);
  return table (res);
end;

giss：

id	col_0	col_0	col_1	col_2	col_3
带有	”	额外	和提及的顺序	的	棘手
“	数据	25	null	null	null
3	100	42	76	null	null
4	20	null	null	50	100

，所以是的，但这是最大的操作：

因此，如果我们返回原始数据

create or replace table sensor_data as 
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]');

并更改代码以不具有最大和组，

 select id
    ,iff(r.value[0] = 0, r.value[1], null) as col_0
    ,iff(r.value[0] = 1, r.value[1], null) as col_1
    ,iff(r.value[0] = 2, r.value[1], null) as col_2
 from sensor_data, table(flatten(input=>rx)) r
 order by 1;

我们会看到：

ID	col_0	col_1	col_2
1	15	null	null
1	1 null	50	null
null 1	null	null	34.2
2	20	null	null
2	null	2 null 2 null 2 null 25	null
3	100	null	null
3	null 3 null	3 null 42	null
3	null 3 null	null	76，

以便我们可以看到这些值被插入了，现在由ID进行了，现在是ID的组，将要滚动这三个ID，因此对于Col0，我们具有15，null，Null max采用最高的值，而15较高，而null则是null的。这是正在进行的过程。

Lets recreate that table:

create table sensor_data as 
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]');

Then get the distinct column keys that you have:

select distinct r.value[0] from sensor_data, table(flatten(input=>rx)) r;

R.VALUE[0]
0
1
2

Given we want this to be dynamic, the next step is just to check with static SQL it gives us the correct answer:

 select id
    ,max(iff(r.value[0] = 0, r.value[1], null)) as col_0
    ,max(iff(r.value[0] = 1, r.value[1], null)) as col_1
    ,max(iff(r.value[0] = 2, r.value[1], null)) as col_2
 from sensor_data, table(flatten(input=>rx)) r
 group by 1
 order by 1;

ID	COL_0	COL_1	COL_2
1	15	50	34.2
2	20	25	null
3	100	42	76

write now to do this dynamically:

declare
  sql string;
  c1 cursor for select distinct r.value[0] as key from sensor_data, table(flatten(input=>rx)) r;
begin
  sql := 'select id ';
  for record in c1 do
    sql := sql || ',max(iff(r.value[0] = '|| record.key::text ||', r.value[1], null)) as col_' || record.key::text;
  end for;
  sql := sql || ' from sensor_data, table(flatten(input=>rx)) r  group by 1  order by 1';
  return sql;
end;

which gives us:

select id ,max(iff(r.value[0] = 0, r.value1, null)) as col_0,max(iff(r.value[0] = 1, r.value1, null)) as col_1,max(iff(r.value[0] = 2, r.value1, null)) as col_2 from sensor_data, table(flatten(input=>rx)) r group by 1 order by 1

which when run gives us the expected results.

so now we want to run that:

declare
  sql string;
  res resultset;
  c1 cursor for select distinct r.value[0] as key from sensor_data, table(flatten(input=>rx)) r;
begin
  sql := 'select id ';
  for record in c1 do
    sql := sql || ',max(iff(r.value[0] = '|| record.key::text ||', r.value[1], null)) as col_' || record.key::text;
  end for;
  sql := sql || ' from sensor_data, table(flatten(input=>rx)) r  group by 1  order by 1';
  
  res := (execute immediate :sql);
  return table (res);
end;

gives:

ID	COL_0	COL_1	COL_2
1	15	50	34.2
2	20	25	null
3	100	42	76

based of code from these sections of the manual:

Working with loops

Working with Resultsets

With extra "tricky data" and the mentioned ORDER BY:

create or replace table sensor_data as 
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]')
             ,(4, '[[0,20],[30,50], [45, 100]]');

declare
  sql string;
  res resultset;
  c1 cursor for select distinct r.value[0] as key from sensor_data, table(flatten(input=>rx)) r order by key;
begin
  sql := 'select id ';
  for record in c1 do
    sql := sql || ',max(iff(r.value[0] = '|| record.key::text ||', r.value[1], null)) as col_' || record.key::text;
  end for;
  sql := sql || ' from sensor_data, table(flatten(input=>rx)) r  group by 1  order by 1';
  
  res := (execute immediate :sql);
  return table (res);
end;

gives:

ID	COL_0	COL_1	COL_2	COL_30	COL_45
1	15	50	34.2	null	null
2	20	25	null	null	null
3	100	42	76	null	null
4	20	null	null	50	100

So yes, but that's the MAX doing:

so if we go back in to original data

create or replace table sensor_data as 
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]');

and alter the code to not have the MAX and the GROUP BY

 select id
    ,iff(r.value[0] = 0, r.value[1], null) as col_0
    ,iff(r.value[0] = 1, r.value[1], null) as col_1
    ,iff(r.value[0] = 2, r.value[1], null) as col_2
 from sensor_data, table(flatten(input=>rx)) r
 order by 1;

we see:

ID	COL_0	COL_1	COL_2
1	15	null	null
1	null	50	null
1	null	null	34.2
2	20	null	null
2	null	25	null
3	100	null	null
3	null	42	null
3	null	null	76

so we can see those values getting unrolled, now the GROUP BY id, is going to roll those three id's up, thus for col0 we have 15, null, null MAX takes the higest value, and 15 is higher that null so that is what is keep. This is the process this is working on.

回复收藏 0 原文

柠檬色的秋千 2025-02-15 15:39:57

使用[]访问数组元素：

SELECT ID, RX[0][1] AS col1,  RX[1][1] AS col1, RX[2][1] AS col2
FROM SENSOR_DATA;

Using [] to access array elements:

SELECT ID, RX[0][1] AS col1,  RX[1][1] AS col1, RX[2][1] AS col2
FROM SENSOR_DATA;

回复收藏 0 原文

┊风居住的梦幻卍 2025-02-15 15:39:57

您的要求不完全是什么，但这很接近

with sensor_data as (
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]')
       as vals
),
flat as (
select id, val.value[1] arrvalue
  from sensor_data,
  lateral flatten(input => sensor_data.rx, outer => true) val
 )
select
     id
    ,listagg(arrvalue, ',') rx_list
from flat
group by id
order by id
  ;

ID  RX_LIST
1   15,50,34.2
2   20,25
3   100,42,76

Not exactly what you asked for but this is close

with sensor_data as (
  select column1 id, parse_json(column2) rx
  from values (1, '[[0, 15], [1, 50], [2, 34.2]]')
             ,(2, '[[0, 20], [1, 25]]')
             ,(3, '[[0, 100], [1, 42], [2, 76]]')
       as vals
),
flat as (
select id, val.value[1] arrvalue
  from sensor_data,
  lateral flatten(input => sensor_data.rx, outer => true) val
 )
select
     id
    ,listagg(arrvalue, ',') rx_list
from flat
group by id
order by id
  ;

ID  RX_LIST
1   15,50,34.2
2   20,25
3   100,42,76

回复收藏 0 原文

~没有更多了~

关于作者

心碎无痕…

暂无简介

文章

27 人气

关注发私信

友情链接

文江博客

将嵌套阵列转换为雪花

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

和上述顺序：

，所以是的，但这是最大的操作：

With extra "tricky data" and the mentioned ORDER BY:

So yes, but that's the MAX doing:

关于作者

相关话题

热门标签

推荐作者

牛↙奶布丁

COSO

落叶

暗地喜欢

qq_i8qOEG

qq_Wl4Sbi

友情链接

将嵌套阵列转换为雪花

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

和上述顺序：

，所以是的，但这是最大的操作：

With extra "tricky data" and the mentioned ORDER BY:

So yes, but that's the MAX doing:

关于作者

相关话题

热门标签

推荐作者

牛↙奶布丁

COSO

落叶

暗地喜欢

qq_i8qOEG

qq_Wl4Sbi

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。