Creating a table in AWS Athena from an S3 bucket .csv file with the help of Glue
I am new to AWS services. I am trying to create an Athena table from a .csv file in an S3 bucket, and I have created a crawler for that. My CSV file originally contains the following input data:
#  name             designation  zip code  build no  address
1  siddarth,james   professor    522135    3         mla colony
2  roy,deshmukh     software     412230    1         sez apartments
3  viliam,mckesson  accountant   628139    10        oakland road
After creating the table in Athena, I get the following output:
#  name      designation  zip_code    build_no  address
1  siddarth  james        professor
2  roy       deshmukh     software
3  viliam    mckesson     contractor
The data is not populated in the proper format, whatever CSV file I use. My required output should look like this:
name             designation  zip code  build no  address
siddarth,james   professor    522135    3         mla colony
roy,deshmukh     software     412230    1         sez apartments
viliam,mckesson  accountant   628139    10        oakland road
Could anyone help me create the table in Athena from the .csv data in the S3 bucket?
Comments (1)
Could you provide a plain-text example line of your CSV? What's shown above is certainly not valid comma-separated values, and it's hard to give advice based on that.
Anyway, it looks like your first column contains commas. So it all comes down to how your values are quoted and how special characters are escaped so that they are processed correctly.
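To illustrate the quoting point, here is a minimal Python sketch using the standard `csv` module. It assumes the raw file is comma-delimited and that the name field is not quoted, which is only a guess since the raw file was not posted; the two sample lines are hypothetical reconstructions of the first row.

```python
import csv
import io

# Hypothetical raw line, unquoted: the comma inside the name acts as a separator.
unquoted = "siddarth,james,professor,522135,3,mla colony\n"
# Same row with the name wrapped in double quotes: its comma stays literal.
quoted = '"siddarth,james",professor,522135,3,mla colony\n'


def parse_line(line):
    """Parse one CSV line using a comma separator and double-quote quoting."""
    return next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))


unquoted_fields = parse_line(unquoted)
quoted_fields = parse_line(quoted)

# The unquoted line splits the name in two and shifts every later column.
print(len(unquoted_fields), unquoted_fields)
# The quoted line keeps "siddarth,james" together as a single field.
print(len(quoted_fields), quoted_fields)
```

If the file is produced by your own tooling, writing it with a CSV library that quotes fields containing the separator (Python's `csv.writer` does this by default) would likely avoid the column shift.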
Have a look at OpenCSVSerDe and its documentation. This should help.
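As a rough sketch of what a table definition using OpenCSVSerDe might look like, the Python helper below builds an Athena DDL string. The function name, the table name, and the S3 location are hypothetical placeholders, and the column list is taken from the crawler output in the question. It assumes the file is comma-delimited, uses double quotes around values that contain commas, and has one header row.

```python
def build_create_table_ddl(table, s3_location):
    """Return an Athena CREATE TABLE statement that uses OpenCSVSerde.

    `table` and `s3_location` are placeholders supplied by the caller.
    All columns are declared as string for simplicity.
    """
    return f"""CREATE EXTERNAL TABLE IF NOT EXISTS {table} (
  `name` string,
  `designation` string,
  `zip_code` string,
  `build_no` string,
  `address` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'separatorChar' = ',',
  'quoteChar' = '"',
  'escapeChar' = '\\\\'
)
LOCATION '{s3_location}'
TBLPROPERTIES ('skip.header.line.count' = '1')"""


# Hypothetical table name and bucket path, for illustration only.
ddl = build_create_table_ddl("employees", "s3://example-bucket/employees/")
print(ddl)
```

The printed statement could be pasted into the Athena query editor. Note that it only helps if the values containing commas are actually quoted in the file; an unquoted comma would still be treated as a separator.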