在unique_key中使用多个列用于DBT中的增量加载
对于增量模型,dbt文档说:
您的模型定义中应提供unique_key,作为表示一个简单列的字符串或可以一起使用的单个引用列名的列表,例如['col1','col2',…])
我已经建立了一个DBT中的增量模型具有此增量定义,
{{
config(
materialized='incremental',
unique_key = ['Col1', 'Col2', 'Col3']
)
}}
该定义将雪花中的此合并语句编译到雪花上的合并语句中:
using DW_DEV.dbt_dgarrison_DATA_STAGING.MY_TABLE__dbt_tmp as DBT_INTERNAL_SOURCE
on
DBT_INTERNAL_SOURCE.['Col1', 'Col2', 'Col3'] = DBT_INTERNAL_DEST.['Col1', 'Col2', 'Col3']
...
这会合理地抛出一个SQL错误,抱怨括号:
SQL汇编错误:语法错误行4处位于位置32意外'['。语法错误行4在位置45意外','。语法错误行4处的位置98意外'['。语法错误行4在位置111意外','。
我找不到其他好的示例使用多列。 (有一些涉及串联列的选项,我对最佳方法的建议开放,但我正在尝试弄清楚如何使用DBT推荐的语法)
For incremental models, the DBT documentation here says:
The unique_key should be supplied in your model definition as a string representing a simple column or a list of single quoted column names that can be used together, for example, ['col1', 'col2', …])
I've built an incremental model in DBT with this incremental definition
{{
config(
materialized='incremental',
unique_key = ['Col1', 'Col2', 'Col3']
)
}}
Which compiles into this merge statement in in Snowflake:
using DW_DEV.dbt_dgarrison_DATA_STAGING.MY_TABLE__dbt_tmp as DBT_INTERNAL_SOURCE
on
DBT_INTERNAL_SOURCE.['Col1', 'Col2', 'Col3'] = DBT_INTERNAL_DEST.['Col1', 'Col2', 'Col3']
...
And this reasonably throws a SQL ERROR complaining about the brackets:
SQL compilation error: syntax error line 4 at position 32 unexpected '['. syntax error line 4 at position 45 unexpected ','. syntax error line 4 at position 98 unexpected '['. syntax error line 4 at position 111 unexpected ','.
I can't find any other good examples using multiple columns this way. (there are options involving concatenating columns, and I'm open to recommendations on the best approach to that, but I'm trying to figure out how to use the DBT recommended syntax)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
作为 dbt-core 1.1.0 ,我们现在可以将列表传递给indore_key语句以增量模型。请参阅原始问题在这里。
这意味着您应该能够通过更新
dbt-core
和您的dbt-< apapter>
在本地实现目标;或相应地将DBT云版本更新为1.1.0,因为给定的错误,它看起来像unique_key
仍在寻找单个字符串而不是数组。As part of dbt-core 1.1.0, we can now pass a list to the unique_key statement in incremental models. See the original issue here.
This means that you should be able to achieve your goal by updating
dbt-core
and yourdbt-<adapter>
version locally; or updating your dbt Cloud version accordingly, to 1.1.0, since given the error you get, it looks likeunique_key
is still looking for a single string instead of an array.如果有任何帮助,DBT在他们的文档中提到了这一点。只需添加一个替代键,然后在快照配置中使用它来实现这一目标。
参考: https://docs.getdbt.com/reference/Resource-configs/unique_key.use-use-a-combination-of-two-columns-as-a-a-a-nique-key
If it's of any help, dbt mentioned this in their docs. Simply add a surrogate key and use it in the snapshot configuration to accomplish this.
Ref: https://docs.getdbt.com/reference/resource-configs/unique_key#use-a-combination-of-two-columns-as-a-unique-key