Download the latest file from a folder in an S3 bucket (not from its subfolders) using Python
I want to download only the latest file located directly inside a folder in an S3 bucket. The folder contains subfolders as well as files, but I need only the file with the latest date, which I then want to upload to a single folder, selecting from multiple folders. The code I am using is adapted from a Stack Overflow answer.
Here is the structure of the S3 bucket:
S3-Bucket
  --folder_1
      --abc2022.01.29.csv
      --bsv2022.02.18.csv
      --test2022.03.04.csv
      --Folder_12
      --Folder_13
      --folder_14
So basically, I want to download the latest file that sits directly inside folder_1, not a file from the subfolders inside it (Folder_12, Folder_13, Folder_14).
I am getting the following error:
TypeError: 'NoneType' object is not subscriptable
Here is the code snippet I am using to download the latest file:
def get_most_recent_s3_object(bucket_name, prefix):
    s3 = session.client('s3')
    paginator = s3.get_paginator("list_objects_v2")
    page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
    latest = None
    for page in page_iterator:
        if "Contents" in page:
            latest2 = max(page['Contents'], key=lambda x: x['LastModified'])
            if latest is None or latest2['LastModified'] > latest['LastModified']:
                latest = latest2
    with open(latest, 'wb') as f:
        s3.download_fileobj(bucket_name, latest, 'C:\\Users\\xxxx\\')
    return latest

latest = get_most_recent_s3_object(bucket_name='bucket_name_1', prefix='folder_1')
print(latest['Key'])
But I am not able to download the file to my local path. The code picks up the latest file from the subfolders rather than from the folder itself (folder_1).
Comments (1)
I modified the code as shown below to download the latest file located directly inside the folder in the S3 bucket, and it works fine. Please find the working code snippet below.
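The snippet itself is not included in this copy of the answer, so what follows is only a minimal sketch of one way to do it, not the answerer's original code. It assumes the likely cause of the error is the prefix: with Prefix='folder_1' (no trailing slash) and Delimiter="/", S3 rolls everything under folder_1/ into CommonPrefixes and returns no Contents, so latest stays None. The helper names normalize_prefix, pick_latest and download_latest are hypothetical, as is the dest_dir parameter; the boto3 calls used are client("s3"), get_paginator("list_objects_v2") and download_file.

```python
import os


def normalize_prefix(prefix):
    """Return the prefix with exactly one trailing slash ('' stays '')."""
    return prefix.rstrip("/") + "/" if prefix else ""


def pick_latest(objects, prefix):
    """Return the newest object located directly under `prefix`, or None.

    `objects` is a list of dicts shaped like the entries of a
    list_objects_v2 "Contents" list (only "Key" and "LastModified" are used).
    """
    prefix = normalize_prefix(prefix)
    candidates = [
        obj for obj in objects
        if obj["Key"].startswith(prefix)
        and obj["Key"] != prefix                  # skip a zero-byte "folder" placeholder
        and "/" not in obj["Key"][len(prefix):]   # skip anything inside a subfolder
    ]
    return max(candidates, key=lambda obj: obj["LastModified"], default=None)


def download_latest(bucket_name, prefix, dest_dir):
    """Download the newest direct child of `prefix` into `dest_dir`.

    Returns the local file path, or None if the folder holds no files.
    """
    # Imported here so pick_latest can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=bucket_name,
        Prefix=normalize_prefix(prefix),  # trailing slash: list inside the folder
        Delimiter="/",                    # subfolders go to CommonPrefixes, not Contents
    )
    objects = [obj for page in pages for obj in page.get("Contents", [])]

    latest = pick_latest(objects, prefix)
    if latest is None:
        return None

    # download_file expects a full local file name, not just a directory.
    local_path = os.path.join(dest_dir, os.path.basename(latest["Key"]))
    s3.download_file(bucket_name, latest["Key"], local_path)
    return local_path
```

Calling download_latest("bucket_name_1", "folder_1", dest_dir), where dest_dir is an existing local directory, would fetch the most recently modified file directly under folder_1/ and ignore Folder_12, Folder_13 and folder_14. Note that "latest" here means the S3 LastModified timestamp, as in the question's code, not the date embedded in the file name.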