Python - S3: download a file using download_fileobj
The following function downloads a file from S3 with a progress callback. The code works, but once the progress bar reaches 100% it takes a long time to return, and the delay grows with the size of the file. I suspect that f.seek(0) is causing the issue, but I'm not sure how to fix it. Removing it causes errors in some cases.
import io
import sys

# BUCKET is assumed to be defined elsewhere as the name of the S3 bucket.

def download(s3_client, s3_object_key):
    # Look up the object size so the progress bar can show a percentage.
    meta_data = s3_client.head_object(Bucket=BUCKET, Key=s3_object_key)
    total_length = int(meta_data.get('ContentLength', 0))
    downloaded = 0

    def progress(chunk):
        # chunk is the number of bytes transferred since the previous call.
        nonlocal downloaded
        downloaded += chunk
        done = int(50 * downloaded / total_length)
        sys.stdout.write("\r[%s%s] %%%s " % ('=' * done, ' ' * (50 - done), round(downloaded / total_length * 100, 2)))
        sys.stdout.flush()

    print(f'Downloading {s3_object_key}')
    f = io.BytesIO()
    s3_client.download_fileobj(BUCKET, s3_object_key, f, Callback=progress)
    f.seek(0)  # <---- Could be the cause
    print('Done.')
    return f
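One way to check the f.seek(0) suspicion is to time the transfer and the seek separately. The sketch below is only an illustration, not a fix. `ProgressBar`, `fake_download_fileobj` and `timed_download` are hypothetical names, and `fake_download_fileobj` is a local stand-in for `s3_client.download_fileobj` so the example runs without network access. The lock is there on the assumption that the callback can be invoked from several worker threads during a managed transfer, and the zero-length guard avoids a division by zero for empty objects.

```python
import io
import sys
import threading
import time


class ProgressBar:
    """Progress callback that is safe to call from several threads."""

    def __init__(self, total_length, stream=sys.stdout, width=50):
        self._total = total_length
        self._stream = stream
        self._width = width
        self._seen = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        with self._lock:
            self._seen += bytes_amount
            # Guard against a zero-length object to avoid ZeroDivisionError.
            fraction = self._seen / self._total if self._total else 1.0
            done = int(self._width * fraction)
            self._stream.write(
                "\r[%s%s] %.2f%%"
                % ("=" * done, " " * (self._width - done), fraction * 100)
            )
            self._stream.flush()


def fake_download_fileobj(payload, fileobj, Callback, chunk_size=1024):
    """Hypothetical stand-in for s3_client.download_fileobj (no network)."""
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        fileobj.write(chunk)
        Callback(len(chunk))


def timed_download(payload, stream=sys.stdout):
    """Return (buffer, timings) with transfer and seek measured separately."""
    buffer = io.BytesIO()
    progress = ProgressBar(len(payload), stream=stream)

    t0 = time.perf_counter()
    fake_download_fileobj(payload, buffer, Callback=progress)
    t1 = time.perf_counter()
    buffer.seek(0)  # only resets the read position; no data is copied
    t2 = time.perf_counter()

    return buffer, {"transfer": t1 - t0, "seek": t2 - t1}
```

In the real function, the same two `time.perf_counter()` pairs could be placed around `s3_client.download_fileobj(...)` and `f.seek(0)`. If the seek timing turns out to be negligible, the delay after 100% is more likely spent inside the transfer call itself than in the seek.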