How to edit/rename/delete keys and values in multiple JSON files stored in AWS S3 from Python
I have JSON files in a folder and its subfolders (folder A has subfolders B and C, and so on) within an AWS S3 bucket. I would like to rename some keys in all of the JSON files and remove some keys and values.
At first, I tried to get the list of the files within folder A:
import boto3

def get_s3_list(bucket, prefix):
    s3 = boto3.client("s3")
    objects = s3.list_objects(Bucket=bucket, Prefix=prefix)
    obj_list = [lc['Key'] for lc in objects['Contents']]
    return obj_list

s3_list = get_s3_list('bucket', 'prefix')
full_s3_list = [ll.split('/') for ll in s3_list]

json_list_files = []
for sub_list in full_s3_list:
    for sb in sub_list:
        if sb.endswith('.json') or sb.endswith('.JSON'):
            json_list_files.append(sub_list)
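A caveat with the listing above: a single `list_objects` call returns at most 1,000 keys, and splitting each key on `/` discards the full key that later S3 calls need. A minimal sketch of an alternative, assuming the caller passes in a `boto3.client("s3")`; `list_json_keys` is a hypothetical helper name, not part of boto3:

```python
def list_json_keys(s3_client, bucket, prefix):
    """Return the full key of every object under `prefix` ending in .json.

    `s3_client` is assumed to be a boto3 S3 client. The paginator follows
    continuation tokens, so listings longer than one page are covered.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            if obj["Key"].lower().endswith(".json"):
                keys.append(obj["Key"])
    return keys
```

It could then be called as `list_json_keys(boto3.client("s3"), "bucket", "prefix")`, which keeps each key whole (for example `A/B/file.json`, an illustrative key) so it can be passed straight to later download and upload calls.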
Then, I want to rename some of the keys in each JSON file. For instance, one JSON file looks like:
{
    "name": "Apple",
    "type": "sweet",
    "size": "12",
    "country": "Germany",
    "path": "s1",
    "other info": "not known"
}
For all the files, I want to rename some keys and remove other keys and their values, so as to get:
{
    "name of fruit": "Apple",
    "taste": "sweet",
    "size": "12",
    "path_id": "s1"
}
I know how to change a single key name in one file, but I cannot figure out how to apply that to all the files and to more than one key name. I have tried the following, but it does not give me the result I want:
new_names = {
    'name': 'name of fruit',
    'type': 'taste',
    'size': 'size',
    'path': 'path_id'
}

for row in json_list_files:
    for k, v in new_names.items():
        for old_name in row:
            if k == old_name:
                row[v] = row.pop(old_name)
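One thing worth noting about the attempt above: `json_list_files` holds lists of path segments produced by `split('/')`, not parsed JSON documents, so the loop operates on lists of strings and never touches the contents of any file. A minimal sketch of one way to apply the mapping to an object in the bucket, assuming each file holds a single flat JSON object and that any key not listed in the mapping should be dropped (which matches the example output). `remap_keys` and `rewrite_object` are hypothetical helper names, and `s3` is assumed to be a `boto3.client("s3")`:

```python
import json

# Old key -> new key. Keys not listed here are dropped (an assumption based on
# the example output, where "country" and "other info" disappear).
NEW_NAMES = {
    "name": "name of fruit",
    "type": "taste",
    "size": "size",
    "path": "path_id",
}


def remap_keys(doc, mapping=NEW_NAMES):
    """Return a new dict with keys renamed via `mapping`; unlisted keys are dropped."""
    return {mapping[key]: value for key, value in doc.items() if key in mapping}


def rewrite_object(s3, bucket, key, mapping=NEW_NAMES):
    """Download one JSON object, remap its keys, and overwrite it in place."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    updated = remap_keys(json.loads(body), mapping)
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(updated, indent=2).encode("utf-8"),
        ContentType="application/json",
    )
    return updated
```

Calling `rewrite_object` once per full object key covers every file. Because this overwrites objects in place, it may be safer to try it on a copy of the data or a bucket with versioning enabled first.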
Comments (1)
You can first download the contents of the S3 bucket, then recursively go through the JSON files and modify them using a JSON processor like `jq`. After that is done, you can run the `aws s3 sync` command so that the modified files are uploaded to the bucket.
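If `jq` is not a requirement, the modify step can also be done in Python once the bucket contents are on disk. A minimal sketch, assuming the files were downloaded into a local directory, that each file holds one flat JSON object, and that keys missing from the mapping should be dropped; `rewrite_local_json` is a hypothetical helper name, and the mapping is taken from the question:

```python
import json
from pathlib import Path

# Old key -> new key, as given in the question; unlisted keys are dropped.
RENAMES = {
    "name": "name of fruit",
    "type": "taste",
    "size": "size",
    "path": "path_id",
}


def rewrite_local_json(root, renames=RENAMES):
    """Rewrite every .json/.JSON file under `root` in place; return the file count."""
    count = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() == ".json":
            doc = json.loads(path.read_text(encoding="utf-8"))
            updated = {renames[k]: v for k, v in doc.items() if k in renames}
            path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
            count += 1
    return count
```

The round trip would then look like `aws s3 sync s3://bucket/prefix ./data`, followed by `rewrite_local_json("./data")`, followed by `aws s3 sync ./data s3://bucket/prefix`, where the bucket, prefix, and directory names are placeholders.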