Ambiguous reference to fields StructField in Databricks Delta Live Tables

Posted 2025-02-11 22:48:33


I have set up Auto Loader to regularly read JSON files and store them in a "bronze" table called fixture_raw using Delta Live Tables in Databricks. This works fine and the JSON data lands in the specified table, but when I add a "silver" table called fixture_prepared and try to extract some of the JSON elements from the bronze table, I get an error:

org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(id,LongType,true), StructField(id,LongType,true)

How can I get around this?

Delta Live Table code:

CREATE OR REFRESH STREAMING LIVE TABLE fixture_raw AS 
SELECT *, input_file_name() AS InputFile, now() AS LoadTime FROM cloud_files(
  "/mnt/input/fixtures/", 
  "json",
  map(
    "cloudFiles.inferColumnTypes", "true",
    "cloudFiles.schemaLocation", "/mnt/dlt/schema/fixture",
    "cloudFiles.schemaEvolutionMode", "addNewColumns"
  )
);

CREATE OR REFRESH LIVE TABLE fixture_prepared AS
WITH FixtureData AS (
  SELECT 
    explode(response) AS FixtureJson
  FROM live.fixture_raw
)
SELECT
  FixtureJson.fixture.id AS FixtureID,
  FixtureJson.fixture.date AS StartTime,
  FixtureJson.fixture.venue.name AS Venue,
  FixtureJson.teams.home.id AS HomeTeamID,
  FixtureJson.teams.home.name AS HomeTeamName,
  FixtureJson.teams.away.id AS AwayTeamID,
  FixtureJson.teams.away.name AS AwayTeamName
FROM FixtureData;

JSON data:

{
    "get": "fixtures",
    "parameters": {
        "league": "39",
        "season": "2022"
    },
    "response": [
        {
            "fixture": {
                "id": 867946,
                "date": "2022-08-05T19:00:00+00:00",
                "venue": {
                    "id": 525,
                    "name": "Selhurst Park"
                }
            },
            "teams": {
                "home": {
                    "id": 52,
                    "name": "Crystal Palace"
                },
                "away": {
                    "id": 42,
                    "name": "Arsenal"
                }
            }
        },
        {
            "fixture": {
                "id": 867947,
                "date": "2022-08-06T11:30:00+00:00",
                "venue": {
                    "id": 535,
                    "name": "Craven Cottage"
                }
            },
            "teams": {
                "home": {
                    "id": 36,
                    "name": "Fulham"
                },
                "away": {
                    "id": 40,
                    "name": "Liverpool"
                }
            }
        }
    ]
}


Comments (1)

杀お生予夺 2025-02-18 22:48:33


There is a difference between assigning the data frame and calling the data frame. Check how the data frame is assigned and how it is called before joining, and go through the official documentation. I followed the same scenario with the sample code in my environment: I added a silver table and it works fine for me, without errors. The reference below has detailed information.

Reference:

https://learn.microsoft.com/en-us/azure/databricks/data-engineering/delta-live-tables/delta-live-tables-quickstart#sql

Delta Live Tables Demo: Modern software engineering for ETL processing.
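One common trigger for this AnalysisException, offered here as a hedged suggestion rather than as part of the answer above, is schema inference producing two struct fields whose names collide, for example JSON keys in some input files that differ only in letter case while spark.sql.caseSensitive is false (the default). A possible workaround is to pin the nested schema of the affected column with cloudFiles.schemaHints so inference cannot emit duplicate fields; the hint string below is an assumption reconstructed from the sample JSON in the question, not a verified fix:

```sql
-- Hypothetical variant of the bronze table: pin the schema of "response"
-- with cloudFiles.schemaHints so Auto Loader cannot infer colliding
-- struct fields from differently-cased keys across files.
CREATE OR REFRESH STREAMING LIVE TABLE fixture_raw AS
SELECT *, input_file_name() AS InputFile, now() AS LoadTime FROM cloud_files(
  "/mnt/input/fixtures/",
  "json",
  map(
    "cloudFiles.inferColumnTypes", "true",
    "cloudFiles.schemaLocation", "/mnt/dlt/schema/fixture",
    "cloudFiles.schemaEvolutionMode", "addNewColumns",
    "cloudFiles.schemaHints",
      "response ARRAY<STRUCT<fixture: STRUCT<id: BIGINT, date: STRING, venue: STRUCT<id: BIGINT, name: STRING>>, teams: STRUCT<home: STRUCT<id: BIGINT, name: STRING>, away: STRUCT<id: BIGINT, name: STRING>>>>"
  )
);
```

If the pipeline still fails, inspecting the inferred schema files under the configured schemaLocation (/mnt/dlt/schema/fixture here) should reveal which field names actually collide.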
