I'm trying to wrap my head around this: I've been doing a lot of reading, and there's a lot of confusing marketing information out there.
So my company lives in a world of relational databases including Oracle, MS Access and MS SQL Server.
We want to consolidate our data sources into a data lake, in particular Azure Data Lake, which is marketed as being able to store any sort of data. Upon further reading, though, it seems that it can only store data in a few file formats, including Parquet, CSV, etc.
So my question is - If we have a bunch of relational databases, do we need to
- Access the source database via an ETL tool (for example Azure Data Factory)
- Grab the source tables of interest and pull them into ADF
- Convert and finally output the data into one of the Azure Data Lake file formats (CSV, Parquet, etc.) using ADL as a target
Did I get this right? For some reason I naively thought you could just grab the database files and drop them into the data lake. I just want to confirm that a step is needed to "convert" the source table(s) and output them in a more common file format (CSV, Parquet, etc.).
Comments (1)
Azure Data Factory supports only the formats listed in the documentation linked below as a source.
You can connect to Oracle, MS Access and SQL Server and select tables as the source. However, you cannot use the raw database files (.dbf, .accdb and .mdf) as a source in ADF.
The file formats supported for ADLS in ADF are given in this official documentation:
Refer - https://learn.microsoft.com/en-us/azure/data-factory/supported-file-formats-and-compression-codecs
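
To make the extract-and-convert step from the question concrete, here is a minimal Python sketch of the same idea outside ADF: query a table through a database connection and write the rows out as CSV. Everything in it is an assumption for illustration: an in-memory `sqlite3` database stands in for Oracle or SQL Server, the `customers` table and the names `export_table_to_csv` and `demo` are hypothetical, and the final upload of the file to the data lake is left out.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Read every row of `table` and write it to `out` as CSV with a header row.

    Hypothetical helper: `conn` is any DB-API style connection, here sqlite3.
    Returns the number of data rows written.
    """
    # Table names cannot be bound as query parameters, so validate first.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])
    rows = 0
    for row in cursor:
        writer.writerow(row)
        rows += 1
    return rows


def demo():
    # Stand-in source database (assumption: replaces Oracle / SQL Server).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
    )
    # An in-memory buffer stands in for the CSV file that would land in the lake.
    buffer = io.StringIO()
    count = export_table_to_csv(conn, "customers", buffer)
    conn.close()
    return count, buffer.getvalue()


if __name__ == "__main__":
    count, text = demo()
    print(count)
    print(text)
```

The point of the sketch is that the data is read through a connection as rows and only then serialized to a file format, which is why the raw .mdf or .accdb file is never touched.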