在 Snowflake 中使用带有 max 和窗口函数的any_value 时出现无效的分组表达式错误

发布于 2025-01-10 01:06:50 字数 2213 浏览 1 评论 0原文

我收到一个查询,我正在尝试修改它以获得每个 COMP_ID 的最新版本。原始查询:

SELECT 
    ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX, 
    COMP_ID::string AS COMP_ID, 
    ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID, 
    ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION, 
    ANY_VALUE(NAME)::string AS NAME, 
    ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
    MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
    MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
    MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
    MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED
FROM COMP_DATA
GROUP BY COMP_ID
ORDER BY COMP_ID;

然后,我尝试使用窗口函数来仅获取每个 comp_id 的最高版本。 这是修改后的查询:

SELECT 
    ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX, 
    COMP_ID::string AS COMP_ID, 
    ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID, 
    ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION, 
    ANY_VALUE(NAME)::string AS NAME, 
    ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
    MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
    MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
    MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
    MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED,
    ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;

尝试编译时出现以下错误:

SQL 编译错误:[COMP_DATA.COMP_ID] 不是有效的分组依据表达式

我最初认为问题是 COMP_VERSION 上的 ANY_VALUE,但删除 ANY_VALUE 后,出现了相同的错误。我发现不出现错误的唯一方法是删除 4 个 MAX 字段和所有 ANY_VALUE(),如下所示:

SELECT 
    DATA_INDEX::string AS DATA_INDEX, 
    COMP_ID::string AS COMP_ID, 
    ACCOUNT_ID::string AS ACCOUNT_ID, 
    COMP_VERSION::string AS COMP_VERSION, 
    NAME::string AS NAME, 
    DESCRIPTION::string AS DESCRIPTION,
    ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;

当然,这根本不够,因为我需要 4 个最大字段。

我还尝试使用 max 字段创建表,并使用窗口函数从新表中选择每个 COMP_ID 的最高 COMP_VERSION,但给出了相同的错误。

I was given a query and I am attempting to modify it in order to get the most recent version of each COMP_ID. The original query:

SELECT 
    ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX, 
    COMP_ID::string AS COMP_ID, 
    ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID, 
    ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION, 
    ANY_VALUE(NAME)::string AS NAME, 
    ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
    MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
    MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
    MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
    MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED
FROM COMP_DATA
GROUP BY COMP_ID
ORDER BY COMP_ID;

I then attempted to use a window function to grab only the highest version for each comp_id.
This is the modified query:

SELECT 
    ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX, 
    COMP_ID::string AS COMP_ID, 
    ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID, 
    ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION, 
    ANY_VALUE(NAME)::string AS NAME, 
    ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
    MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
    MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
    MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
    MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED,
    ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;

When attempting to compile the below error is given:

SQL compilation error: [COMP_DATA.COMP_ID] is not a valid group by expression

I had originally thought the issue was the ANY_VALUE on COMP_VERSION, but after removing the ANY_VALUE the same error was given. The only way I found to not get an error was removing the 4 MAX fields and all of the ANY_VALUE()'s, as shown below:

SELECT 
    DATA_INDEX::string AS DATA_INDEX, 
    COMP_ID::string AS COMP_ID, 
    ACCOUNT_ID::string AS ACCOUNT_ID, 
    COMP_VERSION::string AS COMP_VERSION, 
    NAME::string AS NAME, 
    DESCRIPTION::string AS DESCRIPTION,
    ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;

Of course this is not at all sufficient since I need the 4 max fields.

I have also tried creating the table with the max fields and from that new table using the window function to select the highest COMP_VERSION of each COMP_ID, but the same error was given.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

无所谓啦 2025-01-17 01:06:50

当您添加 QUALIFY 子句时,您从 SQL 中删除了 GROUP BY 子句,像 MAX 这样的聚合函数需要所有选择都是聚合函数或具有 GROUP BY 子句。

因此,如果您只想要每个分组子句的最佳行(您注意到这一点),则需要显式对聚合函数进行窗口化。因此,

SELECT 
    data_index::string AS data_index, 
    comp_id::string AS comp_id, 
    account_id::string AS account_id, 
    comp_version::string AS comp_version, 
    name::string AS name, 
    description::string AS description,
    MAX(object_dict:"startshape-type")OVER(PARTITION BY comp_id)[0]::string  AS startshape_type,
    MAX(object_dict:"startshape-connector-type")OVER (PARTITION BY comp_id)[0]::string AS startshape_connector_type ,
    MAX(object_dict:"startshape-action-type")OVER (PARTITION BY comp_id)[0]::string AS statshape_action_type,
    MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id)[0]::string AS overrides_enabled,
FROM COMP_DATA
QUALIFY 1 = ROW_NUMBER() OVER (PARTITION BY comp_id ORDER BY comp_version DESC);

您很有可能需要在这些 MAX 周围添加一组括号,例如

(MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id))[0]::string AS overrides_enabled,

但我怀疑它可以开箱即用。我假设你不想要 row_number 所以将其推入限定条件(因为它始终是值 1)

When you added your QUALIFY clause you dropped the GROUP BY clause from your SQL, aggregate function like MAX, need all selections to be aggregate function OR to have a GROUP BY clause.

So if you only want the best row per the grouping clause, which you note, you aggregate functions need to be explicitly windowed. Thus

SELECT 
    data_index::string AS data_index, 
    comp_id::string AS comp_id, 
    account_id::string AS account_id, 
    comp_version::string AS comp_version, 
    name::string AS name, 
    description::string AS description,
    MAX(object_dict:"startshape-type")OVER(PARTITION BY comp_id)[0]::string  AS startshape_type,
    MAX(object_dict:"startshape-connector-type")OVER (PARTITION BY comp_id)[0]::string AS startshape_connector_type ,
    MAX(object_dict:"startshape-action-type")OVER (PARTITION BY comp_id)[0]::string AS statshape_action_type,
    MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id)[0]::string AS overrides_enabled,
FROM COMP_DATA
QUALIFY 1 = ROW_NUMBER() OVER (PARTITION BY comp_id ORDER BY comp_version DESC);

There is a small chance you will need to add a set of brackets around those MAX's like

(MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id))[0]::string AS overrides_enabled,

But I suspect it will work out of the box. And I assumed you don't want the row_number so pushed it into the qualify (because it will always be the value 1)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文