AWS Glue redshift_tmp_dir growing in size
As I understand it, when pushing data to Redshift, Glue writes the data to a 'temp' S3 location and then uses Redshift's COPY to load it from there.
I recently scanned our S3 buckets and noticed that the path one of our jobs uses for redshift_tmp_dir is growing in size, and not insignificantly!
So is it up to the developer to clear that location out at the end of a job?
I guess I assumed that the Glue processes took care of everything (naive, I guess!)
Comments (1)
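Easiest would be to set up a lifecycle rule in S3 to clear out old files automatically.
Find the S3 bucket, open the "Management" tab, and add a lifecycle rule that expires (deletes) objects after X days. You can scope the rule to the redshift_tmp_dir prefix so nothing else in the bucket is affected.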
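If you would rather script this than click through the console, here is a rough sketch using boto3. The bucket name, prefix, and 7-day retention are made-up placeholders, and the helper names are mine, so adjust to whatever your job actually passes as redshift_tmp_dir. One thing to watch: `put_bucket_lifecycle_configuration` replaces the bucket's whole lifecycle configuration, so if the bucket already has rules you would need to fetch them first and merge.

```python
from typing import Any, Dict


def build_expiration_rule(prefix: str, days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration that expires objects under `prefix`.

    `prefix` and `days` are illustrative inputs, not values taken from Glue.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                # Rule ID derived from the prefix, purely for readability.
                "ID": "expire-" + prefix.strip("/").replace("/", "-"),
                # Only objects whose key starts with this prefix are affected.
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # S3 removes matching objects once they are `days` old.
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiration_rule(bucket: str, prefix: str, days: int) -> None:
    """Apply the rule to `bucket`. Overwrites any existing lifecycle config."""
    import boto3  # imported here so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_expiration_rule(prefix, days),
    )
```

Calling `apply_expiration_rule("my-glue-bucket", "glue/redshift-tmp/", 7)` (hypothetical names) would then set the rule. Expiration is handled asynchronously by S3, so objects can linger a little past the cutoff before they disappear.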