Speeding up update queries with executemany under cx_Oracle
I have around 0.5 million rows that need to be updated in two tables of an Oracle database.
I split the 0.5 million rows into 100K-row chunks and tried executemany, but it takes forever. I have 10 files with 0.5 million rows each, so updating the tables this way would take far too long. I need suggestions to improve or speed up the process.
My code:
import logging
import time

insert_batch_size = 1000

def UpdateDBWith_Y(Processed_HSMemberFileDf, connection, cursor):
    try:
        Columns = {'ccd_aadhaar_seeding_status': 'Y', 'CCD_LST_MOD_BY': 'SYSTEM'}
        Processed_HSMemberFileDf = Processed_HSMemberFileDf.assign(**Columns)
        Processed_HSMemberFileDf.drop('ADHAAR_SEEDING_STATUS', axis=1, inplace=True)
        ListAcctoDB = ['ccd_aadhaar_seeding_status', 'CCD_LST_MOD_BY', 'PAN']
        Processed_HSMemberFileDf = Processed_HSMemberFileDf[ListAcctoDB]
        bind_update = Processed_HSMemberFileDf.values.tolist()
        print(len(bind_update))
        length = 0
        bind_update_chunks = [bind_update[x:x + insert_batch_size]
                              for x in range(0, len(bind_update), insert_batch_size)]
        for insert_chunk in bind_update_chunks:
            start_time = time.time()
            _updatequery = "update UCIDBA.client_code_dtls set ccd_aadhaar_seeding_status = :1,CCD_LST_MOD_BY=:2,CCD_LST_MOD_DT=SYSDATE where CCD_PAN_NO= :3"
            _updatequery_chld = "update UCIDBA.client_code_child_dtls set ccc_aadhaar_seeding_status = :1,CCC_LST_MOD_BY=:2,CCC_LST_MOD_DT=SYSDATE where CCC_PAN_NO= :3"
            logging.info("UpdateDBWith_Y:->Update Query : |" + _updatequery)
            logging.info("UpdateDBWith_Y:->Update Query child: |" + _updatequery_chld)
            cursor.executemany(_updatequery, insert_chunk)
            connection.commit()
            cursor.executemany(_updatequery_chld, insert_chunk)
            connection.commit()
            length = length + len(insert_chunk)
            logging.info("UpdateDBWith_Y:->inserted till : |" + str(length))
            print("--- %s seconds ---" % (time.time() - start_time))
        return True
    except Exception as e:
        logging.exception("UpdateDBWith_Y:->: |")
        return False
Is there any other way to speed this up, or a better way to use executemany?
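Two changes that commonly help with cx_Oracle's executemany are moving the SQL definitions out of the per-chunk loop and committing once at the end of the run instead of after every batch (each commit costs a round trip and a redo flush). Below is a minimal sketch of that pattern. The helper names (chunked, fast_update), the batch size, and the bind-buffer lengths passed to setinputsizes are illustrative assumptions, not part of the original code; it assumes rows of (status, modified_by, pan) tuples, as bound in the question, and an already-open cx_Oracle connection.

```python
def chunked(rows, size):
    """Split a list of bind tuples into executemany-sized batches."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def fast_update(connection, rows, batch_size=50000):
    """Run both UPDATEs batch by batch and commit once at the end.

    `rows` is a list of (status, modified_by, pan) tuples, the same
    shape bound in the question. Assumes an open cx_Oracle connection.
    """
    # Define the SQL once, outside the loop, so the prepared statement
    # stays cached on the cursor instead of being re-set per chunk.
    sql_parent = (
        "update UCIDBA.client_code_dtls set ccd_aadhaar_seeding_status = :1, "
        "CCD_LST_MOD_BY = :2, CCD_LST_MOD_DT = SYSDATE where CCD_PAN_NO = :3"
    )
    sql_child = (
        "update UCIDBA.client_code_child_dtls set ccc_aadhaar_seeding_status = :1, "
        "CCC_LST_MOD_BY = :2, CCC_LST_MOD_DT = SYSDATE where CCC_PAN_NO = :3"
    )
    cursor = connection.cursor()
    for batch in chunked(rows, batch_size):
        # Pre-size the string bind buffers before each executemany call
        # (1, 6 and 10 characters are assumptions about the data) so
        # cx_Oracle allocates them up front rather than growing them.
        cursor.setinputsizes(1, 6, 10)
        cursor.executemany(sql_parent, batch)
        cursor.setinputsizes(1, 6, 10)
        cursor.executemany(sql_child, batch)
    connection.commit()  # one commit for the whole run, not per chunk
```

Larger batches (tens of thousands of rows per executemany call, memory permitting) usually beat 1000-row chunks, since each call is one round trip to the server; the single final commit also makes the run all-or-nothing, which may or may not suit your recovery requirements.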