Python dictionary: checking for duplicates based on date
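I am looping over a directory and reading some JSON files. From each file I parse out 4 keys, then I create a CSV file with all the parsed data.
It turns out I have duplicate entries, so I want to eliminate duplicates based on date (keeping the newer one) and then rewrite the CSV. I am not sure how to implement this.
For example: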
    import csv
    import json
    import os
    from datetime import datetime


    def mdy_to_ymd(d):
        # parse a date such as "Jan 05 2021" into a datetime object that can be compared
        return datetime.strptime(d, '%b %d %Y')


    def date_converter(date):  # convert the date to a readable string for the CSV
        return datetime.strptime(date, '%b %d %Y').strftime('%d/%m/%Y')


    def csv_generator(path):  # build the rows for the CSV
        ffresult = []
        duplicate_dict = {}
        for file in os.listdir(path):  # iterate through the directory with the files
            fresult = []
            with open(os.path.join(path, file), "r") as result:  # open the JSON file
                templates = json.load(result)
            hostname_str = file.split(".")
            site_code_str = file[:5]
            # NOTE: datetime_str2 is not defined in the posted snippet;
            # the line that extracts it is omitted from the question
            datetime_str3 = mdy_to_ymd(datetime_str2)  # convert the date to comparable data
            duplicate_dict[hostname_str[0]] = datetime_str3
            # ?? I am creating a dictionary with the hostname as key and the date as value,
            # but it does not work: when the same hostname appears again it just overwrites
            # the existing key, so there are no duplicates, but nothing guarantees that
            # the entry that remains is the newest one by date
            fresult.append(site_code_str)
            fresult.append(hostname_str[0])
            fresult.append(templates["execution_status"])
            fresult.append(date_converter(datetime_str2))
            fresult.append(templates["protocol_name"])
            fresult.append(templates["protocol_version"])
            ffresult.append(fresult)
            # I append the values I need into 2 lists
        return ffresult


    # `directory` is defined elsewhere in the original script
    with open("jsondicts.csv", "w", newline="") as dst:
        writetoit = csv.writer(dst)
        writetoit.writerows(csv_generator(directory))
    # this is how I write to the CSV, so right now I have duplicate values in it
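I want only unique values based on hostname, and for each hostname only the newest entry based on date, together with the other parsed data (protocol name, site code, etc.).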
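One possible way to get "one row per hostname, newest date wins" with only the standard library is to key a dictionary on the hostname and replace the stored row only when the incoming date is later. The sketch below is a minimal illustration, not the asker's final solution. It assumes rows in the same column order that `csv_generator` builds (hostname at index 1, a `dd/mm/YYYY` date string at index 3); the function name `newest_per_host` and the sample rows are hypothetical.

```python
from datetime import datetime

CSV_DATE_FORMAT = "%d/%m/%Y"  # format produced by date_converter in the question


def newest_per_host(rows):
    """Return one row per hostname, keeping the row with the latest date.

    Assumes row[1] is the hostname and row[3] is a dd/mm/YYYY date string.
    """
    newest = {}  # hostname -> (parsed date, row)
    for row in rows:
        hostname = row[1]
        parsed = datetime.strptime(row[3], CSV_DATE_FORMAT)
        # Overwrite only when this row is strictly newer than the stored one.
        if hostname not in newest or parsed > newest[hostname][0]:
            newest[hostname] = (parsed, row)
    return [row for _, row in newest.values()]


# Hypothetical sample data: two entries for host-a, one for host-b.
# Comparing the raw strings would wrongly rank "25/01/2021" above "10/03/2021".
sample_rows = [
    ["SITE1", "host-a", "ok", "25/01/2021", "bgp", "1.0"],
    ["SITE1", "host-a", "ok", "10/03/2021", "bgp", "1.1"],
    ["SITE2", "host-b", "failed", "01/02/2021", "ospf", "2.0"],
]

deduplicated = newest_per_host(sample_rows)
```

Under these assumptions the result could be passed to `csv.writer(...).writerows(...)` in place of the raw list. The key point is that the comparison happens on parsed dates, not on the formatted strings.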
Comments (1)
This solves it, though I had to use the pandas library.
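The comment does not say which pandas calls were used. One common pattern for this kind of problem is to sort by a parsed date column and then drop duplicates on the hostname, keeping the last (newest) row. The sketch below assumes that approach; the column names and sample rows are hypothetical and only mirror the column order from the question.

```python
import pandas as pd

# Hypothetical column names for the six values written per row in the question.
columns = ["site_code", "hostname", "execution_status", "date",
           "protocol_name", "protocol_version"]

# Hypothetical sample data with a duplicate hostname.
rows = [
    ["SITE1", "host-a", "ok", "25/01/2021", "bgp", "1.0"],
    ["SITE1", "host-a", "ok", "10/03/2021", "bgp", "1.1"],
    ["SITE2", "host-b", "failed", "01/02/2021", "ospf", "2.0"],
]

df = pd.DataFrame(rows, columns=columns)

# Parse the dd/mm/YYYY strings so that sorting is chronological, not alphabetical.
df["parsed_date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

newest = (
    df.sort_values("parsed_date")                       # oldest first
      .drop_duplicates(subset="hostname", keep="last")  # keep newest per hostname
      .drop(columns="parsed_date")                      # helper column no longer needed
)
```

Writing the result with `newest.to_csv("jsondicts.csv", index=False)` would then produce a CSV with one row per hostname.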