雪花sql，嵌套在变体的第一级的阵列的平坦路径参数

发布于 2025-01-22 13:09:27 字数 1411 浏览 4 评论 0原文

根据以下内容，有一个由存储对象（又称字典，键值对）组成的变体字段“事件”，如下所示：

[
{ "field_name": "status",  "id": 987418431597, "previous_value": "new", "type": "Change", "value": "pending"},
{ "field_name": "360020024138", "id": 987418431617, "previous_value": null, "type": "Change", "value": "#55927" },
{ "field_name": "360016698218", "id": 987418431637, "previous_value": null, "type": "Change", "value": "0681102386"},
{ "field_name": "360016774537", "id": 987418431657, "previous_value": null, "type": "Change", "value": "89367031562011632212"}
]

此字段属于事件日志，我试图使用“事件“作为获得相关时间戳和ID的过滤器。

通过 flatten上的雪花文档我发现<我发现<代码>递归=＆gt; true 参数使我能够将变体一直延伸到其嵌套对象，但是希望通过优化代码，我想使用path参数，以选择性地展开“事件” 但是，对于我感兴趣的对象。

但是，由于某种原因，Flatten不允许我通过数字路径来识别我要扩展的对象的数组索引，例如：

select b.* 
from "event_log" a
    ,lateral flatten (input => a."events", path => 0) b limit 100;

返回：无效类型，0）]对于参数'路径'

和

select b.* 
from "event_log" a
   , lateral flatten (input => a."events", path => [0]) b limit 100;

返回：语法错误：意外'['。（第162行）

具有讽刺意味的是，使用recursive =＆gt; true，B.Path字段表示[i]之类的索引。

Sflake Docs中的示例是使用存储数组的对象使用路径参数的一个示例变体。

原文

There is a Variant field "events" that is made up of an Array storing Objects(aka dictionaries, key-value pairs), as per the below:

[
{ "field_name": "status",  "id": 987418431597, "previous_value": "new", "type": "Change", "value": "pending"},
{ "field_name": "360020024138", "id": 987418431617, "previous_value": null, "type": "Change", "value": "#55927" },
{ "field_name": "360016698218", "id": 987418431637, "previous_value": null, "type": "Change", "value": "0681102386"},
{ "field_name": "360016774537", "id": 987418431657, "previous_value": null, "type": "Change", "value": "89367031562011632212"}
]

This field belongs to an event log, and I am trying to use the content of "events" as a filter to get the related timestamps and ids.

Through the Snowflake documentation on Flatten I found out that the recursive => True parameter allows me to expand the Variant all the way down to its nested objects, but with hopes of optimising code, I wanted to use the path parameter, to selectively expand "events" only for the Objects I was interested in.

However, for some reason, Flatten does not allow me to pass a numeric path to identify the Array index of Object that I want to expand, as:

select b.* 
from "event_log" a
    ,lateral flatten (input => a."events", path => 0) b limit 100;

returns: invalid type [NUMBER(1,0)] for parameter 'path'

and

select b.* 
from "event_log" a
   , lateral flatten (input => a."events", path => [0]) b limit 100;

returns: Syntax error: unexpected '['. (line 162)

Ironically, when using recursive => True, the b.path field represent indexes like this [i].

The example in the SFlake docs makes an example of the use of the path parameter with an Object that stores Arrays, whereas here "events" is made up of an Array of Objects, so I actually do not have any working example for this type of Variants.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

﹏半生如梦愿梦如真 2025-01-29 13:09:27

阵列索引应作为Input：

select b.* 
from "event_log" a,
lateral flatten (input => a."events"[1]) b
limit 100;

示例

CREATE OR REPLACE TABLE "event_log"("events" VARIANT)
AS
SELECT '[   {     "field_name": "status",     "id": 987418431597,     "previous_value": "new",     "type": "Change",     "value": "pending"   },   {     "field_name": "360020024138",     "id": 987418431617,     "previous_value": null,     "type": "Change",     "value": "#55927"   },   {     "field_name": "360016698218",     "id": 987418431637,     "previous_value": null,     "type": "Change",     "value": "0681102386"   },   {     "field_name": "360016774537",     "id": 987418431657,     "previous_value": null,     "type": "Change",     "value": "89367031562011632212"   } ]';

：

The array index should be provided as input:

select b.* 
from "event_log" a,
lateral flatten (input => a."events"[1]) b
limit 100;

Sample:

CREATE OR REPLACE TABLE "event_log"("events" VARIANT)
AS
SELECT '[   {     "field_name": "status",     "id": 987418431597,     "previous_value": "new",     "type": "Change",     "value": "pending"   },   {     "field_name": "360020024138",     "id": 987418431617,     "previous_value": null,     "type": "Change",     "value": "#55927"   },   {     "field_name": "360016698218",     "id": 987418431637,     "previous_value": null,     "type": "Change",     "value": "0681102386"   },   {     "field_name": "360016774537",     "id": 987418431657,     "previous_value": null,     "type": "Change",     "value": "89367031562011632212"   } ]';

Output:

回复收藏 0 原文

甩你一脸翔 2025-01-29 13:09:27

因此，每行都有一个数组。该数组具有许多具有相同“结构”的对象。获得ID是有道理的，因为每个对象都有一个。

因此，您可以访问它。

select a."events"[0]:id::number as id
from "event_log" as a
limit 100;

给出：

A。“事件” [0]：ID
987418431597

，但鉴于每个对象的ID看起来与众不同，您知道自己要纠正ID的情况似乎更有意义地使用Flatter来展开数组。并访问对象元素。

2行”中

with "event_log"("events") as (
    select parse_json(column1) from values
    ('[
        { "field_name": "status",  "id": 0987418431597, "previous_value": "new", "type": "Change", "value": "pending"},
        { "field_name": "360020024138", "id": 0987418431617, "previous_value": null, "type": "Change", "value": "#55927" },
        { "field_name": "360016698218", "id": 0987418431637, "previous_value": null, "type": "Change", "value": "0681102386"},
        { "field_name": "360016774537", "id":0987418431657, "previous_value": null, "type": "Change", "value": "89367031562011632212"}
        ]'),
    ('[
        { "field_name": "status",  "id": 1987418431597, "previous_value": "new", "type": "Change", "value": "pending"},
        { "field_name": "360020024138", "id":1987418431617, "previous_value": null, "type": "Change", "value": "#55927" },
        { "field_name": "360016698218", "id": 1987418431637, "previous_value": null, "type": "Change", "value": "0681102386"},
        { "field_name": "360016774537", "id":1987418431657, "previous_value": null, "type": "Change", "value": "89367031562011632212"}
        ]')
)
select b.seq as input_row
    ,b.index as array_index
    ,b.value:field_name::text as field_name
    ,b.value:id::number as id
    ,b.value:previous_value::text as previous_value
    ,b.value:type::text as type
    ,b.value:value::text as value
from "event_log" as a
    ,lateral flatten(input=>a."events") b
;

：

input_row	array_index	field_name	id	revory_value	类型	2
1	0	状态	987418431597	新	更改	未决
1	1	360020024138	987418418431617		0	状态
：	更改	得到	0016774537		此	我们
，	“	，	987418431657		89367031562011632212	因此
获取	在	我们	1987418431597	新	变化	悬而未决
2	1	360020024138	1987418431617		变更	55927
2	2	360016698218			的	＃

So you have an array in each row. The array has many objects with the same "structure". Getting the ID makes sense as there is one in every object.

So you could just access it.

select a."events"[0]:id::number as id
from "event_log" as a
limit 100;

gives:

A."EVENTS"[0]:ID
987418431597

But given each object's ID looks to be different how do you know you are getting to correct ID's it would seem to make more sense to use flatten to unroll the array. and access the object elements.

so in this "2 rows" of 4 fields:

with "event_log"("events") as (
    select parse_json(column1) from values
    ('[
        { "field_name": "status",  "id": 0987418431597, "previous_value": "new", "type": "Change", "value": "pending"},
        { "field_name": "360020024138", "id": 0987418431617, "previous_value": null, "type": "Change", "value": "#55927" },
        { "field_name": "360016698218", "id": 0987418431637, "previous_value": null, "type": "Change", "value": "0681102386"},
        { "field_name": "360016774537", "id":0987418431657, "previous_value": null, "type": "Change", "value": "89367031562011632212"}
        ]'),
    ('[
        { "field_name": "status",  "id": 1987418431597, "previous_value": "new", "type": "Change", "value": "pending"},
        { "field_name": "360020024138", "id":1987418431617, "previous_value": null, "type": "Change", "value": "#55927" },
        { "field_name": "360016698218", "id": 1987418431637, "previous_value": null, "type": "Change", "value": "0681102386"},
        { "field_name": "360016774537", "id":1987418431657, "previous_value": null, "type": "Change", "value": "89367031562011632212"}
        ]')
)
select b.seq as input_row
    ,b.index as array_index
    ,b.value:field_name::text as field_name
    ,b.value:id::number as id
    ,b.value:previous_value::text as previous_value
    ,b.value:type::text as type
    ,b.value:value::text as value
from "event_log" as a
    ,lateral flatten(input=>a."events") b
;

we get:

INPUT_ROW	ARRAY_INDEX	FIELD_NAME	ID	PREVIOUS_VALUE	TYPE	VALUE
1	0	status	987418431597	new	Change	pending
1	1	360020024138	987418431617		Change	#55927
1	2	360016698218	987418431637		Change	0681102386
1	3	360016774537	987418431657		Change	89367031562011632212
2	0	status	1987418431597	new	Change	pending
2	1	360020024138	1987418431617		Change	#55927
2	2	360016698218	1987418431637		Change	0681102386
2	3	360016774537	1987418431657		Change	89367031562011632212