SQL PRESTO-交叉加入Unnest，条纹扁平

发布于 2025-01-21 13:23:16 字数 2553 浏览 1 评论 0原文

为什么我的查询不会像预期的那样使数据变平？

我正在查询一个列包含数组的表。我的目标是不肯定阵列中的物品，以找到独特的物品并将其变成行。

SELECT
table1.tag_names,
table1.tag_ids, 
rank_position
FROM table1
CROSS JOIN UNNEST (tag_ids, tag_names)
   WITH ORDINALITY as T (tag_ids, tag_names, rank_position)
ORDER BY tag_ids

结果：

tag_names	tag_ids	rank_position
[“ red”，“ blue”，“ green”]	[111，222，333]	1
[“红色”，“蓝色”，“ Yellow”]	[111，222，444]	4

所需的结果：

tag_names	tag_ids	rank_position
“ red”	111	1
“蓝色”	222	2
“绿色”	333	3
“黄色”	444	4

我缺少什么？

更新以使用别名，下面的新结果：

tag_names	tag_ids	rank_position
“ red”	111	1
“红色”	111	10
“红色”	111	3
”	111	“ 111 12
12 “黄色”	444	4

原文

Why is my query not flattening the data as expected?

I am querying a table where columns contain arrays. My goal is to unnest the items in the arrays to find unique items and turn those into rows.

SELECT
table1.tag_names,
table1.tag_ids, 
rank_position
FROM table1
CROSS JOIN UNNEST (tag_ids, tag_names)
   WITH ORDINALITY as T (tag_ids, tag_names, rank_position)
ORDER BY tag_ids

Results:

tag_names	tag_ids	rank_position
["red", "blue", "green"]	[111, 222, 333]	1
["red", "blue", "yellow"]	[111, 222, 444]	4

Desired Results:

tag_names	tag_ids	rank_position
"red"	111	1
"blue"	222	2
"green"	333	3
"yellow"	444	4

What am I missing?

Updated to use alias, new results below:

tag_names	tag_ids	rank_position
"red"	111	1
"red"	111	10
"red"	111	3
"red"	111	12
"yellow"	444	4

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

你丑哭了我 2025-01-28 13:23:16

您应该在交叉加入Unnest中使用的别名在选择：

-- sample data
WITH dataset (tag_names, tag_ids) AS (
    VALUES (array['red', 'blue', 'green'], array[111, 222, 444])
) 

-- query
select T.tag_names,
   T.tag_ids, 
   rank_position
from dataset
CROSS JOIN UNNEST (tag_ids, tag_names)
   WITH ORDINALITY as T (tag_ids, tag_names, rank_position)
ORDER BY tag_ids

输出：

tag_names	tag_ids	rank_posity
red	111	1
蓝色	222	2
绿色	444	3

upd>

upd> upd> upd> 准则不跨多行起作用，实现所需结果的一种方法是使阵列弄平，然后使用和 row_number ：

-- sample data
WITH dataset (tag_names, tag_ids) AS (
    VALUES (array['red', 'blue', 'green'], array[111, 222, 333]),
     (array['red', 'blue', 'yellow'], array[111, 222, 444])
) 

-- query
select *, row_number() over (order by tag_ids) rank_position
from (
        select T.tag_names,
            T.tag_ids
        from dataset
            CROSS JOIN UNNEST (tag_ids, tag_names) AS T (tag_ids, tag_names)
            GROUP BY T.tag_names, T.tag_ids
    )
ORDER BY tag_ids

output：

tag_names	tag_ids	rank_position
row_number row_number 红色	111	1
蓝色	222	2
绿色	333	3
黄色	444	4

You should use alias introduced for the flattened data in the CROSS JOIN UNNEST in the select:

-- sample data
WITH dataset (tag_names, tag_ids) AS (
    VALUES (array['red', 'blue', 'green'], array[111, 222, 444])
) 

-- query
select T.tag_names,
   T.tag_ids, 
   rank_position
from dataset
CROSS JOIN UNNEST (tag_ids, tag_names)
   WITH ORDINALITY as T (tag_ids, tag_names, rank_position)
ORDER BY tag_ids

Output:

tag_names	tag_ids	rank_position
red	111	1
blue	222	2
green	444	3

UPD

ORDINALITY does not work across multiple rows, one way to achieve desired result is to flatten the arrays, then use group by and row_number:

-- sample data
WITH dataset (tag_names, tag_ids) AS (
    VALUES (array['red', 'blue', 'green'], array[111, 222, 333]),
     (array['red', 'blue', 'yellow'], array[111, 222, 444])
) 

-- query
select *, row_number() over (order by tag_ids) rank_position
from (
        select T.tag_names,
            T.tag_ids
        from dataset
            CROSS JOIN UNNEST (tag_ids, tag_names) AS T (tag_ids, tag_names)
            GROUP BY T.tag_names, T.tag_ids
    )
ORDER BY tag_ids

Output: