在 Snowflake 中使用带有 max 和窗口函数的any_value 时出现无效的分组表达式错误
我收到一个查询,我正在尝试修改它以获得每个 COMP_ID 的最新版本。原始查询:
SELECT
ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID,
ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION,
ANY_VALUE(NAME)::string AS NAME,
ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED
FROM COMP_DATA
GROUP BY COMP_ID
ORDER BY COMP_ID;
然后,我尝试使用窗口函数来仅获取每个 comp_id 的最高版本。 这是修改后的查询:
SELECT
ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID,
ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION,
ANY_VALUE(NAME)::string AS NAME,
ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED,
ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;
尝试编译时出现以下错误:
SQL 编译错误:[COMP_DATA.COMP_ID] 不是有效的分组依据表达式
我最初认为问题是 COMP_VERSION 上的 ANY_VALUE,但删除 ANY_VALUE 后,出现了相同的错误。我发现不出现错误的唯一方法是删除 4 个 MAX 字段和所有 ANY_VALUE(),如下所示:
SELECT
DATA_INDEX::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ACCOUNT_ID::string AS ACCOUNT_ID,
COMP_VERSION::string AS COMP_VERSION,
NAME::string AS NAME,
DESCRIPTION::string AS DESCRIPTION,
ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;
当然,这根本不够,因为我需要 4 个最大字段。
我还尝试使用 max 字段创建表,并使用窗口函数从新表中选择每个 COMP_ID 的最高 COMP_VERSION,但给出了相同的错误。
I was given a query and I am attempting to modify it in order to get the most recent version of each COMP_ID. The original query:
SELECT
ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID,
ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION,
ANY_VALUE(NAME)::string AS NAME,
ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED
FROM COMP_DATA
GROUP BY COMP_ID
ORDER BY COMP_ID;
I then attempted to use a window function to grab only the highest version for each comp_id.
This is the modified query:
SELECT
ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID,
ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION,
ANY_VALUE(NAME)::string AS NAME,
ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED,
ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;
When attempting to compile the below error is given:
SQL compilation error: [COMP_DATA.COMP_ID] is not a valid group by expression
I had originally thought the issue was the ANY_VALUE on COMP_VERSION, but after removing the ANY_VALUE the same error was given. The only way I found to not get an error was removing the 4 MAX fields and all of the ANY_VALUE()'s, as shown below:
SELECT
DATA_INDEX::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ACCOUNT_ID::string AS ACCOUNT_ID,
COMP_VERSION::string AS COMP_VERSION,
NAME::string AS NAME,
DESCRIPTION::string AS DESCRIPTION,
ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;
Of course this is not at all sufficient since I need the 4 max fields.
I have also tried creating the table with the max fields and from that new table using the window function to select the highest COMP_VERSION of each COMP_ID, but the same error was given.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
当您添加 QUALIFY 子句时,您从 SQL 中删除了 GROUP BY 子句,像 MAX 这样的聚合函数需要所有选择都是聚合函数或具有 GROUP BY 子句。
因此,如果您只想要每个分组子句的最佳行(您注意到这一点),则需要显式对聚合函数进行窗口化。因此,
您很有可能需要在这些 MAX 周围添加一组括号,例如
但我怀疑它可以开箱即用。我假设你不想要 row_number 所以将其推入限定条件(因为它始终是值 1)
When you added your QUALIFY clause you dropped the GROUP BY clause from your SQL, aggregate function like MAX, need all selections to be aggregate function OR to have a GROUP BY clause.
So if you only want the best row per the grouping clause, which you note, you aggregate functions need to be explicitly windowed. Thus
There is a small chance you will need to add a set of brackets around those MAX's like
But I suspect it will work out of the box. And I assumed you don't want the row_number so pushed it into the qualify (because it will always be the value 1)