您可以在 AWS Athena 中拥有架构或文件夹结构吗?
我正在将整个雪花数据库复制到 S3 中,以便通过 Athena 查看。我想保留架构/层次结构,以便相应的查询不会更改。为此,S3 中的所有文件都已正确组织,如下所示
DataBase/Schema/Folder/Table/{parquet files}
当我使用 Glue 抓取时,它们最终都位于同一级别的一个数据库中。 Athena 中是否可以有类似的文件夹结构?
现在 Athena 中的所有查询都像
Select *
FROM database.table
我想要的
Select *
FROM database.schema.folder.table
I am copying an entire snowflake DB into S3 to be viewed through Athena. I would like to preserve the schema/hierarchy so that the corresponding queries do not change. All the files are organized properly for this in S3 as follows
DataBase/Schema/Folder/Table/{parquet files}
When I crawl with Glue they all end up in one DB at the same level. Is it possible to have a similar folder structure in Athena?
Right now all queries in Athena are like
Select *
FROM database.table
I would like to have
Select *
FROM database.schema.folder.table
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
Athena 是一个数据库,正如您所指出的,Athena 中没有层次结构、模式或文件夹的概念。
数据库和架构组成 Snowflake 中的命名空间。如果您的目的只是拥有一个类似的命名空间,您可以将 Snowflake 数据库
d1
和架构名称s1
结合起来,在 Athena中创建扁平化逻辑分组d1_s1。然后您可以执行以下操作:
此外,数据库名称中可以包含的唯一特殊字符是 下划线,因此确实没有其他方法可以保留结构或现有查询。至少,这种格式足够接近,应该很容易以编程方式修复现有查询(例如,使用正则表达式将
abc
替换为a_b.c
)。然而,仍然会有差异。例如,授予Snowflake 数据库和架构的管理方式不同。模式还有一个
托管访问
的概念。这在雅典娜是不可能的。The only logical grouping of tables available in Athena is a
database
, and as you have indicated, there is no concept of hierarchy, schemas, or folders in Athena.Database and schema comprise a namespace in Snowflake. If your intention is to simply have a similar namespace, what you can do is combine the Snowflake database
d1
and schema names1
to create a flattened logical grouping in Athenad1_s1
. Then you can do:Also, the only special character you can have in the database name is an underscore, so there really is no other way to preserve the structure or the existing queries. At least, this way the format is close enough that it should be easy enough to programmatically fix the existing queries (e.g., using regex to replace
a.b.c
witha_b.c
).However, there will still be differences. For example, grants are managed differently for Snowflake databases and schemas. Schemas also have a concept of
managed access
. This will not be possible in Athena.