Runtime Error: Cannot set database in Spark! [dbt + spark + thrift]
Can anyone help me on this?
I'm getting the error ***Runtime Error: Cannot set database in spark!***
while running a dbt model via the Spark Thrift Server with a remote Hive metastore.
I need to transform some models in dbt using Apache Spark as the adapter. I'm currently running Spark locally on my machine, and I started the Thrift Server as below with the remote Hive metastore URI.
- Started master
./sbin/start-master.sh
- Started worker
./sbin/start-worker.sh spark://master_url:7077
- Started Thrift Server
./sbin/start-thriftserver.sh --master spark://master_url:7077
--packages org.apache.iceberg:iceberg-spark3-runtime:0.13.1 --hiveconf hive.metastore.uris=thrift://ip:9083
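(Not in the original post.) Before pointing dbt at the Thrift Server, one way to sanity-check that it is up and can reach the metastore is to connect with Beeline, which ships with Spark. The host, port, and user below mirror the dbt profile values from the question; adjust them for your setup:

```shell
# Connect to the local Spark Thrift Server via JDBC and list databases.
# Host/port/user are taken from the dbt profile in the question.
./bin/beeline -u jdbc:hive2://localhost:10000 -n admin -e "SHOW DATABASES;"
```

If this fails, the problem is in the Spark/metastore setup rather than in dbt.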
In my DBT project,
project_name:
  target: dev
  outputs:
    dev:
      type: spark
      method: thrift
      host: localhost
      port: 10000
      user: admin
      schema: test_dbt
      threads: 4
While executing dbt run, I get the following error.
dbt run --select test -t dev
Running with dbt=1.1.0
Partial parse save file not found. Starting full parse.
Encountered an error:
Runtime Error
Cannot set database in spark!
Please note that there is not much info in dbt.log.
SOLUTION
This error was caused by the "database" field in the source YAML file.
Always schema, never database
Apache Spark uses the terms "schema" and "database" interchangeably. dbt understands database to exist at a higher level than schema. As such, you should never use or set database as a node config or in the target profile when running dbt-spark.
https://docs.getdbt.com/reference/resource-configs/spark-configs#always-schema-never-database
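To make the fix concrete, here is a sketch of a broken source definition next to the working form. The source and table names are placeholders, not from the original post:

```yaml
# Broken: dbt-spark rejects a database key on a source
sources:
  - name: my_source          # placeholder name
    database: test_dbt       # triggers "Cannot set database in spark!"
    tables:
      - name: my_table

# Working: use schema instead of database
sources:
  - name: my_source
    schema: test_dbt
    tables:
      - name: my_table
```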
Comments (1)
The schema test_dbt does not exist in Hive.
I think you need to create the test_dbt database in Hive:
step 1. Log in to the Spark cluster, stop the Thrift Server, and run spark-sql.
step 2. Create the database test_dbt.
step 3. Restart the Thrift Server.
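The three steps above correspond roughly to the following commands, run from the Spark home directory. This is a sketch: the metastore URI, master URL, and package flags are copied from the question's startup commands and will differ in your environment:

```shell
# Step 1: stop the running Thrift Server
./sbin/stop-thriftserver.sh

# Step 2: create the database against the same remote metastore
./bin/spark-sql --hiveconf hive.metastore.uris=thrift://ip:9083 \
  -e "CREATE DATABASE IF NOT EXISTS test_dbt"

# Step 3: restart the Thrift Server with the original flags
./sbin/start-thriftserver.sh --master spark://master_url:7077 \
  --packages org.apache.iceberg:iceberg-spark3-runtime:0.13.1 \
  --hiveconf hive.metastore.uris=thrift://ip:9083
```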
OR
you can use the default schema, like below.
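The snippet the answer refers to is not included in the post. A minimal sketch of a profile output that targets Spark's built-in default schema (so no database needs to be created up front) might look like this, reusing the connection values from the question:

```yaml
project_name:
  target: dev
  outputs:
    dev:
      type: spark
      method: thrift
      host: localhost
      port: 10000
      user: admin
      schema: default   # Spark's built-in default database
      threads: 4
```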