在 AWS Athena 中,
我尝试连接多个表,然后使用分区键保存它。
运行 create table 后
select *
from t1
union all
select *
from t2
select *
from t3
在控制台上
create table db.table_name
with(
format='parquet',
external_location=...
) AS
select *
from t1
union all
select *
from t2
select *
from t3;
会创建这样的查询,但我想添加按列分区。我尝试
在顶部和底部添加分区。还保存了查询结果,然后使用 CREATE EXTERNAL TABLE
命令从中创建了新表(这有效,但返回空行 -> 即使在运行 MSCK REPAIR
后)
来自 https://aws.amazon.com/premiumsupport/knowledge-center/athena-create-use-partitioned-tables/ 看来我需要按分区将数据保存到 S3 中,因此在 Bucket1 中它将如果“年份”列是分区列,则有 Bucket1/2021、Bucket1/2022。正确的?如果是,创建分区存储桶是否有效?
From AWS Athena
I am trying to concatenate multiple tables then save it with partitoned key.
after running
select *
from t1
union all
select *
from t2
select *
from t3
create table on the console creates query like this,
create table db.table_name
with(
format='parquet',
external_location=...
) AS
select *
from t1
union all
select *
from t2
select *
from t3;
But I want to add partitoned by column. I've tried
adding partitioned by on top and bottom. Also saved query result then created new table from that using CREATE EXTERNAL TABLE
command (this works but return empty row -> even after running MSCK REPAIR
)
From https://aws.amazon.com/premiumsupport/knowledge-center/athena-create-use-partitioned-tables/ It seems like I need to save data into S3 by partitions so in bucket1 it will have bucket1/2021, bucket1/2022 if 'year' column is the partition column. Correct? If yes, is there efficient to create partitioned buckets?
发布评论
评论(1)
我已经使用此方法成功创建了新的分区表:
I have successfully created new, partitioned tables by using this method: