Why can't I see ingestion time columns in the Output schema of a Custom Transform in AWS Glue?
I am trying to add ingestion time columns to my DynamicFrame using add_ingestion_time_columns (https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-extensions-glue-context.html#aws-glue-api-crawler-pyspark-extensions-glue-context-add-ingestion-time-columns).
I created a job in AWS Glue that joins two CSV files, and then added a Custom Transform in which I try to append these new time columns to my output:
from awsglue.dynamicframe import DynamicFrame, DynamicFrameCollection

def MyTransform(glueContext, dfc) -> DynamicFrameCollection:
    first_df = dfc.select(list(dfc.keys())[0]).toDF()
    # Append the ingestion time columns at "hour" granularity to the Spark DataFrame.
    with_time_df = glueContext.add_ingestion_time_columns(first_df, "hour")
    dynamic_frame = DynamicFrame.fromDF(with_time_df, glueContext, "DynamicFrameDateAndHour")
    return DynamicFrameCollection({"CustomTrasform": dynamic_frame}, glueContext)
I expected to see the new columns in the Output schema, but it looks like nothing happened. Does anyone know why, and what I should change to add these columns?
Comments (1)
OK, I figured it out: the new columns have to be added to the Output schema manually, and they are then filled in automatically. In your Custom Transform, go to Output schema -> Edit -> Add root key, add the first column (for example ingest_year), and click Apply. Then use Add root key again for each of the other columns.
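To make it easier to know which root keys to add, here is a small sketch that lists the column names to enter in the Output schema for a given granularity. The helper expected_ingestion_columns is hypothetical and not part of awsglue; the name mapping is an assumption based on how the GlueContext documentation describes add_ingestion_time_columns (ingest_year, ingest_month, ingest_day, ingest_hour, ingest_minute, with "day", "hour" and "minute" as the valid granularities), so check it against what your job actually writes.

```python
from typing import List

# Hypothetical helper for illustration only; it does not call AWS Glue.
# Column names are assumed from the GlueContext documentation for
# add_ingestion_time_columns and should be verified against real job output.
_COLUMNS_IN_ORDER = [
    "ingest_year",
    "ingest_month",
    "ingest_day",
    "ingest_hour",
    "ingest_minute",
]

# The finest-grained column assumed to be appended for each granularity.
_LAST_COLUMN = {
    "day": "ingest_day",
    "hour": "ingest_hour",
    "minute": "ingest_minute",
}


def expected_ingestion_columns(time_granularity: str) -> List[str]:
    """Return the column names to add by hand in the Output schema."""
    if time_granularity not in _LAST_COLUMN:
        raise ValueError(
            "time_granularity must be one of: " + ", ".join(sorted(_LAST_COLUMN))
        )
    last_index = _COLUMNS_IN_ORDER.index(_LAST_COLUMN[time_granularity])
    # Keep every column from the year down to the requested granularity.
    return _COLUMNS_IN_ORDER[: last_index + 1]


if __name__ == "__main__":
    # The transform in the question uses "hour".
    for name in expected_ingestion_columns("hour"):
        print(name)
```

For the "hour" granularity used in the question, that means one root key for each name the script prints.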