Glue Dynamic Frame: parsing a text file with a ¶ delimiter
I have a text file that looks like this:
HDR¶20200101
BDY¶1¶Jimmy
BDY¶1¶Something
TRL¶123
I would like to parse it into a Glue DynamicFrame, filtering out the header (HDR) and trailer (TRL) rows, and assign the column names ID and Name. I tried the code below, but it does not work.
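import re

from awsglue.context import GlueContext
from awsglue.transforms import Filter
from pyspark.context import SparkContext

glueContext = GlueContext(SparkContext.getOrCreate())

# Read the gzip-compressed file from S3 as CSV with "¶" as the separator.
dyf_test = glueContext.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://Files/test.gz"], "recurse": True},
    format="csv",
    format_options={"withHeader": False, "separator": "¶"},
)

# Keep only rows whose first field is neither HDR nor TRL.
# Without a header row, Glue names the columns col0, col1, ...
dyf_test = Filter.apply(
    frame=dyf_test,
    f=lambda row: not re.match("HDR|TRL", row["col0"]),
)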
Error: com.amazonaws.services.glue.util.FatalException: Unable to parse file: test.gz
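To make the intended result concrete, here is a minimal plain-Python sketch of the parsing I am after, independent of Glue. The function name `parse_body_rows`, the in-memory gzip round trip, and the choice to drop the leading BDY tag are assumptions made for illustration. It does not explain the FatalException above; it only shows the expected rows.

```python
import gzip
import io

DELIMITER = "¶"
SKIP_TAGS = {"HDR", "TRL"}  # record types to drop


def parse_body_rows(lines):
    """Return body records as dicts with ID and Name keys."""
    records = []
    for line in lines:
        fields = line.rstrip("\r\n").split(DELIMITER)
        if not fields[0] or fields[0] in SKIP_TAGS:
            continue  # skip blank lines, the header and the trailer
        # Assumption: body rows are TAG¶ID¶Name; any other field count
        # raises ValueError here instead of being silently accepted.
        _tag, record_id, name = fields
        records.append({"ID": record_id, "Name": name})
    return records


# Stand-in for test.gz: the sample data, gzip-compressed in memory.
sample = "HDR¶20200101\nBDY¶1¶Jimmy\nBDY¶1¶Something\nTRL¶123\n"
compressed = gzip.compress(sample.encode("utf-8"))

# Decode explicitly as UTF-8 so the multi-byte "¶" is read correctly.
with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as fh:
    rows = parse_body_rows(fh)

print(rows)
```

In a Glue job, rows shaped like this could presumably be loaded into a Spark DataFrame and converted with `DynamicFrame.fromDF`, but I have not verified that route against the real file.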