使用DBT中使用Star Macro获取列的名称和类型

发布于 2025-02-09 21:47:29 字数 856 浏览 1 评论 0 原文

使用Star宏,除了列名称外,还有一种方法可以获取列数据类型(布尔值,数值等)吗?

例如,此查询使用Star宏从参考表收集列名,将其保存为数组变量 column_names ,然后我循环循环此数组,并将最大函数应用于所有列函数。

{% set column_names = star(
    from=ref_table,
    except=["a", "b", "c"],
    as_list=True)
%}

select 
    date_trunc('week', day) as week,
    name,

    {%- for col in column_names %}  
    max({{ col|lower }}) as {{ col | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {%- endfor %}

from {{ ref('my_table_name') }}    
group by 1, 2

我想有条件地将最大函数应用于布尔列。

这可能看起来像是类似的

{%- for col in column_names %}  
    {% if is_boolean(col) %}  
    max({{ col|lower }}) as {{ col | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {% endif %}
{%- endfor %}

,但是问题是星形宏将列名作为字符串传递,因此它不会随身携带任何元数据。

我如何在此处获取列数据类型?

数据仓库:雪花

Using the star macro, is there a way to also get the column data type (boolean, numerical, etc), in addition to the column name?

For example, this query uses the star macro to collect the column names from a reference table, saves it as an array variable column_names, and then I loop over this array and apply the max function to all the columns.

{% set column_names = star(
    from=ref_table,
    except=["a", "b", "c"],
    as_list=True)
%}

select 
    date_trunc('week', day) as week,
    name,

    {%- for col in column_names %}  
    max({{ col|lower }}) as {{ col | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {%- endfor %}

from {{ ref('my_table_name') }}    
group by 1, 2

I would like to conditionally apply the max function to only boolean columns.

This might look something like

{%- for col in column_names %}  
    {% if is_boolean(col) %}  
    max({{ col|lower }}) as {{ col | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {% endif %}
{%- endfor %}

but the problem is that the star macro passes the column names as a string, so it's not carrying any metadata with it.

How might I get the column data type here?

Data warehouse: Snowflake

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

他夏了夏天 2025-02-16 21:47:29

您可以查看 dbt_utils.star的源=“ noreferrer”>此处

在引擎盖下,它使用 dbt_utils.get_filtered_columns_in_relation 。该宏还返回列名。然而!该宏使用内置,返回 data_type 属性。

因此,您的代码变为:

{% set all_columns = adapter.get_columns_in_relation(
    ref("my_table")
) %}
{% set except_col_names=["a", "b", "c"] %}

select 
    date_trunc('week', day) as week,
    name,

    {%- for col in all_columns if col.name not in except_col_names %}  
    {% if col.data_type == 'BOOLEAN' %}  
    max({{ col.name|lower }}) as {{ col.name|lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {% endif %}
    {%- endfor %}

from {{ ref('my_table_name') }}    
group by 1, 2

You can view the source for dbt_utils.star here

Under the hood, it uses dbt_utils.get_filtered_columns_in_relation. That macro also just returns column names. However! that macro uses the built-in adapter.get_columns_in_relation, which returns a list of Column objects, which have a data_type property.

So your code becomes:

{% set all_columns = adapter.get_columns_in_relation(
    ref("my_table")
) %}
{% set except_col_names=["a", "b", "c"] %}

select 
    date_trunc('week', day) as week,
    name,

    {%- for col in all_columns if col.name not in except_col_names %}  
    {% if col.data_type == 'BOOLEAN' %}  
    max({{ col.name|lower }}) as {{ col.name|lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {% endif %}
    {%- endfor %}

from {{ ref('my_table_name') }}    
group by 1, 2
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文