将 athena 结果保存到另一个带有分区的表

发布于 2025-01-11 09:37:41 字数 850 浏览 0 评论 0 原文

在 AWS Athena 中,

我尝试连接多个表,然后使用分区键保存它。

运行 create table 后

select *
  from t1
union all
select *
  from t2
select *
  from t3

在控制台上

create table db.table_name
with(
format='parquet',
external_location=...
) AS
select *
  from t1
union all
select *
  from t2
select *
  from t3;

会创建这样的查询,但我想添加按列分区。我尝试

在顶部和底部添加分区。还保存了查询结果,然后使用 CREATE EXTERNAL TABLE 命令从中创建了新表(这有效,但返回空行 -> 即使在运行 MSCK REPAIR 后)

来自 https://aws.amazon.com/premiumsupport/knowledge-center/athena-create-use-partitioned-tables/ 看来我需要按分区将数据保存到 S3 中,因此在 Bucket1 中它将如果“年份”列是分区列,则有 Bucket1/2021、Bucket1/2022。正确的?如果是,创建分区存储桶是否有效?

From AWS Athena

I am trying to concatenate multiple tables then save it with partitoned key.

after running

select *
  from t1
union all
select *
  from t2
select *
  from t3

create table on the console creates query like this,

create table db.table_name
with(
format='parquet',
external_location=...
) AS
select *
  from t1
union all
select *
  from t2
select *
  from t3;

But I want to add partitoned by column. I've tried

adding partitioned by on top and bottom. Also saved query result then created new table from that using CREATE EXTERNAL TABLE command (this works but return empty row -> even after running MSCK REPAIR)

From https://aws.amazon.com/premiumsupport/knowledge-center/athena-create-use-partitioned-tables/ It seems like I need to save data into S3 by partitions so in bucket1 it will have bucket1/2021, bucket1/2022 if 'year' column is the partition column. Correct? If yes, is there efficient to create partitioned buckets?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

南风起 2025-01-18 09:37:41

我已经使用此方法成功创建了新的分区表:

CREATE TABLE my_table
WITH (
 format = 'PARQUET',
 parquet_compression = 'SNAPPY',
 external_location = 's3://bucket/folder/',
 partitioned_by = ARRAY['year']
)
AS
SELECT
  ...

I have successfully created new, partitioned tables by using this method:

CREATE TABLE my_table
WITH (
 format = 'PARQUET',
 parquet_compression = 'SNAPPY',
 external_location = 's3://bucket/folder/',
 partitioned_by = ARRAY['year']
)
AS
SELECT
  ...
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文