Unsupported schema specified for Pubsub source in CREATE TABLE

Posted on 2025-02-13 20:10:59

Following a link I found on Google, I'm trying to set up a sample that publishes messages to Pub/Sub and loads them into a BigQuery table using Dataflow SQL.

But when I create the Dataflow job, I get the error below:

Invalid/unsupported arguments for SQL job launch: Invalid table
specification in Data Catalog: Unsupported schema specified for Pubsub
source in CREATE TABLE.CREATE TABLE for Pubsub topic must include at
least 'event_timestamp' field of type 'TIMESTAMP'"

Please help me fix this and answer the following questions:

  1. Is it mandatory to keep the event_timestamp field in the Pub/Sub schema/Dataflow SQL/BigQuery table?
  2. When I created the Pub/Sub topic with a schema, the schema was not reflected in Dataflow SQL. When I assigned it manually from Cloud Shell using gcloud data-catalog entries update, it was reflected: searching for the topic name in Dataflow SQL showed the schema. Which is the right method to assign a schema to a Pub/Sub topic?
  3. Data Catalog also does not show the schema assigned to the Pub/Sub topic.

Let me know if any more details are required.

Comments (1)

執念 2025-02-20 20:11:00

I followed the documentation and got it working. Answers to your questions:

1. Is it mandatory to keep the event_timestamp field in the Pub/Sub schema/Dataflow SQL/BigQuery table?

Yes, it is necessary in this scenario, because the following query uses the TUMBLE function with the event_timestamp column as its DESCRIPTOR. Note: for a Pub/Sub source, you must specify the event_timestamp field as the timestamp_column:

 SELECT
   sr.sales_region,
   TUMBLE_START("INTERVAL 15 SECOND") AS period_start,
   SUM(tr.amount) as amount
 FROM pubsub.topic.`project-id`.transactions AS tr
   INNER JOIN bigquery.table.`project-id`.dataflow_sql_tutorial.us_state_salesregions AS sr
   ON tr.state = sr.state_code
 GROUP BY
   sr.sales_region,
   TUMBLE(tr.event_timestamp, "INTERVAL 15 SECOND")
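
To illustrate why the window needs a timestamp on every row, here is a minimal Python sketch of what a 15-second tumbling window does. It is not Dataflow SQL and not how Dataflow implements windowing; the function names and the sample rows are hypothetical, and only the window size and the grouping columns are taken from the query above.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 15  # same size as INTERVAL 15 SECOND in the query


def window_start(ts: datetime, size: int = WINDOW_SECONDS) -> datetime:
    """Return the start of the fixed-size window that contains ts."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % size, tz=timezone.utc)


def sum_by_window(rows):
    """Sum amounts per (sales_region, window start), like the GROUP BY above."""
    totals = defaultdict(float)
    for region, event_timestamp, amount in rows:
        totals[(region, window_start(event_timestamp))] += amount
    return dict(totals)


# Hypothetical sample rows: (sales_region, event_timestamp, amount)
rows = [
    ("West", datetime(2025, 2, 13, 20, 10, 1, tzinfo=timezone.utc), 10.0),
    ("West", datetime(2025, 2, 13, 20, 10, 14, tzinfo=timezone.utc), 5.0),
    ("West", datetime(2025, 2, 13, 20, 10, 16, tzinfo=timezone.utc), 2.0),
]
totals = sum_by_window(rows)
```

The first two rows land in the window starting at 20:10:00 and the third in the window starting at 20:10:15. Without a timestamp column there is nothing to assign rows to windows by, which is why the schema has to carry event_timestamp.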

2. When I created the Pub/Sub topic with a schema, the schema was not reflected in Dataflow SQL. When I assigned it manually from Cloud Shell using gcloud data-catalog entries update, it was reflected: searching for the topic name in Dataflow SQL showed the schema. Which is the right method to assign a schema to a Pub/Sub topic?

You can assign a schema with either the console or the gcloud command. However, both are subject to the following guidelines about using schemas:

  • You cannot add schemas to existing topics.
  • You can specify a schema only when you create a topic.
  • After a schema is associated with a topic, you cannot update the schema or remove its association with that topic.
  • You can apply the same schema to other new topics.

You can use gcloud data-catalog entries update when updating an existing schema.
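
Before running that update, it may help to check locally that the schema you are about to assign satisfies the rule quoted in the error message. The following is a minimal Python sketch of such a check, not what Data Catalog itself does. It assumes the schema is held as a list of dicts with "column" and "type" keys; those key names, the function name, and the sample schemas are my assumptions, so adjust them to match your schema file.

```python
REQUIRED_COLUMN = "event_timestamp"
REQUIRED_TYPE = "TIMESTAMP"


def has_event_timestamp(columns):
    """Return True if the schema has an event_timestamp column of type TIMESTAMP."""
    return any(
        col.get("column") == REQUIRED_COLUMN
        and str(col.get("type", "")).upper() == REQUIRED_TYPE
        for col in columns
    )


# Hypothetical schemas; the "column"/"type" key names are an assumption.
good_schema = [
    {"column": "event_timestamp", "type": "TIMESTAMP"},
    {"column": "state", "type": "STRING"},
    {"column": "amount", "type": "FLOAT64"},
]
bad_schema = [
    {"column": "state", "type": "STRING"},
    {"column": "amount", "type": "FLOAT64"},
]

good_ok = has_event_timestamp(good_schema)
bad_ok = has_event_timestamp(bad_schema)
```

A schema like bad_schema is the kind that would trigger the "must include at least 'event_timestamp' field of type 'TIMESTAMP'" error from the question.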

3. Data Catalog also does not show the schema assigned to the Pub/Sub topic.

You can check this with gcloud data-catalog entries lookup; let me know what it returns.
