有没有办法在启用SQLalchemy Orm对象时避免循环?
我目前正在使用SQLalchemy Core来收集一些异源数据库中的数据,运行一些计算,然后使用SQLalchemy ORM模型在新数据库中写入数据。 这是时间分解:
开始-2022-06-27 13:53:50.967981
结束B3DB INIT -0:00:02.276641
结束核心读取-0:00:00.686167
结束计算-0:00:00.577456
结束dadodb init -0:00:00.063829
END ORM Intancing -0:00:03.511613
结束dadodb提交-0:00:01.050192
总计-0:00:08.165898
最高的时间阶段是通过大约要插入的数据的1K行循环。
有两个表,这是我在做的:
undrl_asset = Asset(
timestamp = datetime.now(),
tradedate = tgtdate,
trckrsymb = undrl_ticker,
openprice = undrl_data.iloc[0]['UNDRL_PREABE'],
maxprice = undrl_data.iloc[0]['UNDRL_PREMAX'],
minprice = undrl_data.iloc[0]['UNDRL_PREMIN'],
avgprice = undrl_data.iloc[0]['UNDRL_PREMED'],
lastprice = undrl_data.iloc[0]['UNDRL_PREULT'],
buyoffer = undrl_data.iloc[0]['UNDRL_PREOFC'],
selloffer = undrl_data.iloc[0]['UNDRL_PREOFV'],
numdeals = int(undrl_data.iloc[0]['UNDRL_TOTNEG']),
totamnt = int(undrl_data.iloc[0]['UNDRL_QUATOT']),
totcash = undrl_data.iloc[0]['UNDRL_VOLTOT'],
vh = pd.to_numeric(undrl_data.iloc[0]['UNDRL_VH'])
)
for i in range(num_calls):
this_option = Derivative(
timestamp = datetime.now(),
tradedate = tgtdate,
trckrsymb = call_raw_data.iloc[i]['OPT_TRCKR_SYMB'],
undrlasst = undrl_ticker,
sgmntname = call_raw_data.iloc[i]['OPT_SEGMENT'],
expirdate = call_raw_data.iloc[i]['OPT_EXP_DATE'],
exrcprice = call_raw_data.iloc[i]['OPT_STRIKE'],
optnstyle = str(call_raw_data.iloc[i]['OPT_STYLE'])[0:1],
mrktprice = call_raw_data.iloc[i]['OPT_PREULT'] if call_raw_data.iloc[i]['OPT_PREULT'] > 0 else None,
dayqntneg = call_raw_data.iloc[i]['OPT_VLMNEG'] if call_raw_data.iloc[i]['OPT_VLMNEG'] > 0 else None,
cvrdpos = int(call_raw_data.iloc[i]['OPT_COVERED']),
blkspos = int(call_raw_data.iloc[i]['OPT_BLOCKD']),
uncvpos = int(call_raw_data.iloc[i]['OPT_UNCOVRD']),
totlpos = int(call_raw_data.iloc[i]['OPT_TOT_OI']),
holdqty = int(call_raw_data.iloc[i]['OPT_TITULRS']),
writqty = int(call_raw_data.iloc[i]['OPT_LANCDRS']),
iq = call_raw_data.iloc[i]['OPT_TITULRS']/call_raw_data.iloc[i]['OPT_LANCDRS'],
timtoexp = call_raw_data.iloc[i]['CALC_TIME_TO_EXP'],
moneynes = call_raw_data.iloc[i]['CALC_MNYNS'],
impldvol = call_raw_data.iloc[i]['CALC_IV'] if call_raw_data.iloc[i]['CALC_IV'] > 0 else None,
polycofa = call_raw_data.iloc[i]['CALC_P_A'],
polycofb = call_raw_data.iloc[i]['CALC_P_B'],
polycofc = call_raw_data.iloc[i]['CALC_P_C'],
chosnvol = call_raw_data.iloc[i]['CALC_VOL'],
optdelta = call_raw_data.iloc[i]['CALC_DELTA'],
optgamma = call_raw_data.iloc[i]['CALC_GAMMA'],
optpctgm = call_raw_data.iloc[i]['CALC_PCT_GAMMA'],
opttheta = call_raw_data.iloc[i]['CALC_THETA'],
optbgt = call_raw_data.iloc[i]['CALC_BGT'] if call_raw_data.iloc[i]['CALC_BGT'] > 0 else None,
totdelta = int(call_raw_data.iloc[i]['TOT_DELTA']),
totgamma = int(call_raw_data.iloc[i]['TOT_GAMMA']),
ttpctgmm = int(call_raw_data.iloc[i]['TOT_PCT_GAMMA']),
tottheta = int(call_raw_data.iloc[i]['TOT_THETA']),
totbgt = int(call_raw_data.iloc[i]['TOT_BGT'])
)
options_array.append(this_option)
for i in range(num_puts):
this_option = Derivative(
timestamp = datetime.now(),
tradedate = tgtdate,
trckrsymb = put_raw_data.iloc[i]['OPT_TRCKR_SYMB'],
undrlasst = undrl_ticker,
sgmntname = put_raw_data.iloc[i]['OPT_SEGMENT'],
expirdate = put_raw_data.iloc[i]['OPT_EXP_DATE'],
exrcprice = put_raw_data.iloc[i]['OPT_STRIKE'],
mrktprice = put_raw_data.iloc[i]['OPT_PREULT'] if put_raw_data.iloc[i]['OPT_PREULT'] > 0 else None ,
dayqntneg = put_raw_data.iloc[i]['OPT_VLMNEG'] if put_raw_data.iloc[i]['OPT_VLMNEG'] > 0 else None ,
cvrdpos = int(put_raw_data.iloc[i]['OPT_COVERED']),
blkspos = int(put_raw_data.iloc[i]['OPT_BLOCKD']),
uncvpos = int(put_raw_data.iloc[i]['OPT_UNCOVRD']),
totlpos = int(put_raw_data.iloc[i]['OPT_TOT_OI']),
holdqty = int(put_raw_data.iloc[i]['OPT_TITULRS']),
writqty = int(put_raw_data.iloc[i]['OPT_LANCDRS']),
iq = put_raw_data.iloc[i]['OPT_TITULRS']/put_raw_data.iloc[i]['OPT_LANCDRS'],
timtoexp = put_raw_data.iloc[i]['CALC_TIME_TO_EXP'],
moneynes = put_raw_data.iloc[i]['CALC_MNYNS'],
impldvol = put_raw_data.iloc[i]['CALC_IV'] if put_raw_data.iloc[i]['CALC_IV'] > 0 else None,
polycofa = put_raw_data.iloc[i]['CALC_P_A'],
polycofb = put_raw_data.iloc[i]['CALC_P_B'],
polycofc = put_raw_data.iloc[i]['CALC_P_C'],
chosnvol = put_raw_data.iloc[i]['CALC_VOL'],
optdelta = put_raw_data.iloc[i]['CALC_DELTA'],
optgamma = put_raw_data.iloc[i]['CALC_GAMMA'],
optpctgm = put_raw_data.iloc[i]['CALC_PCT_GAMMA'],
opttheta = put_raw_data.iloc[i]['CALC_THETA'],
optbgt = put_raw_data.iloc[i]['CALC_BGT'] if put_raw_data.iloc[i]['CALC_BGT'] > 0 else None,
totdelta = int(put_raw_data.iloc[i]['TOT_DELTA']),
totgamma = int(put_raw_data.iloc[i]['TOT_GAMMA']),
ttpctgmm = int(put_raw_data.iloc[i]['TOT_PCT_GAMMA']),
tottheta = int(put_raw_data.iloc[i]['TOT_THETA']),
totbgt = int(put_raw_data.iloc[i]['TOT_BGT'])
)
options_array.append(this_option)
有没有办法以矢量化的方式声明ORM对象或对象进行代码,从而避免创建新实例的循环?
I am currently using SQLAlchemy CORE to collect data from some heterogenous databases, run some calculations and then write the data in a new database using SQLAlchemy ORM model.
Here is the time breakdown:
Start -- 2022-06-27 13:53:50.967981
End B3DB Init -- 0:00:02.276641
End CORE Read -- 0:00:00.686167
End Calculations -- 0:00:00.577456
End DADODB Init -- 0:00:00.063829
End ORM Instancing -- 0:00:03.511613
End DADODB Commit -- 0:00:01.050192
Total -- 0:00:08.165898
The most time costly phase is looping through the 1K rows approximately of data to be inserted.
There are two tables and this is what I'm doing:
undrl_asset = Asset(
timestamp = datetime.now(),
tradedate = tgtdate,
trckrsymb = undrl_ticker,
openprice = undrl_data.iloc[0]['UNDRL_PREABE'],
maxprice = undrl_data.iloc[0]['UNDRL_PREMAX'],
minprice = undrl_data.iloc[0]['UNDRL_PREMIN'],
avgprice = undrl_data.iloc[0]['UNDRL_PREMED'],
lastprice = undrl_data.iloc[0]['UNDRL_PREULT'],
buyoffer = undrl_data.iloc[0]['UNDRL_PREOFC'],
selloffer = undrl_data.iloc[0]['UNDRL_PREOFV'],
numdeals = int(undrl_data.iloc[0]['UNDRL_TOTNEG']),
totamnt = int(undrl_data.iloc[0]['UNDRL_QUATOT']),
totcash = undrl_data.iloc[0]['UNDRL_VOLTOT'],
vh = pd.to_numeric(undrl_data.iloc[0]['UNDRL_VH'])
)
for i in range(num_calls):
this_option = Derivative(
timestamp = datetime.now(),
tradedate = tgtdate,
trckrsymb = call_raw_data.iloc[i]['OPT_TRCKR_SYMB'],
undrlasst = undrl_ticker,
sgmntname = call_raw_data.iloc[i]['OPT_SEGMENT'],
expirdate = call_raw_data.iloc[i]['OPT_EXP_DATE'],
exrcprice = call_raw_data.iloc[i]['OPT_STRIKE'],
optnstyle = str(call_raw_data.iloc[i]['OPT_STYLE'])[0:1],
mrktprice = call_raw_data.iloc[i]['OPT_PREULT'] if call_raw_data.iloc[i]['OPT_PREULT'] > 0 else None,
dayqntneg = call_raw_data.iloc[i]['OPT_VLMNEG'] if call_raw_data.iloc[i]['OPT_VLMNEG'] > 0 else None,
cvrdpos = int(call_raw_data.iloc[i]['OPT_COVERED']),
blkspos = int(call_raw_data.iloc[i]['OPT_BLOCKD']),
uncvpos = int(call_raw_data.iloc[i]['OPT_UNCOVRD']),
totlpos = int(call_raw_data.iloc[i]['OPT_TOT_OI']),
holdqty = int(call_raw_data.iloc[i]['OPT_TITULRS']),
writqty = int(call_raw_data.iloc[i]['OPT_LANCDRS']),
iq = call_raw_data.iloc[i]['OPT_TITULRS']/call_raw_data.iloc[i]['OPT_LANCDRS'],
timtoexp = call_raw_data.iloc[i]['CALC_TIME_TO_EXP'],
moneynes = call_raw_data.iloc[i]['CALC_MNYNS'],
impldvol = call_raw_data.iloc[i]['CALC_IV'] if call_raw_data.iloc[i]['CALC_IV'] > 0 else None,
polycofa = call_raw_data.iloc[i]['CALC_P_A'],
polycofb = call_raw_data.iloc[i]['CALC_P_B'],
polycofc = call_raw_data.iloc[i]['CALC_P_C'],
chosnvol = call_raw_data.iloc[i]['CALC_VOL'],
optdelta = call_raw_data.iloc[i]['CALC_DELTA'],
optgamma = call_raw_data.iloc[i]['CALC_GAMMA'],
optpctgm = call_raw_data.iloc[i]['CALC_PCT_GAMMA'],
opttheta = call_raw_data.iloc[i]['CALC_THETA'],
optbgt = call_raw_data.iloc[i]['CALC_BGT'] if call_raw_data.iloc[i]['CALC_BGT'] > 0 else None,
totdelta = int(call_raw_data.iloc[i]['TOT_DELTA']),
totgamma = int(call_raw_data.iloc[i]['TOT_GAMMA']),
ttpctgmm = int(call_raw_data.iloc[i]['TOT_PCT_GAMMA']),
tottheta = int(call_raw_data.iloc[i]['TOT_THETA']),
totbgt = int(call_raw_data.iloc[i]['TOT_BGT'])
)
options_array.append(this_option)
for i in range(num_puts):
this_option = Derivative(
timestamp = datetime.now(),
tradedate = tgtdate,
trckrsymb = put_raw_data.iloc[i]['OPT_TRCKR_SYMB'],
undrlasst = undrl_ticker,
sgmntname = put_raw_data.iloc[i]['OPT_SEGMENT'],
expirdate = put_raw_data.iloc[i]['OPT_EXP_DATE'],
exrcprice = put_raw_data.iloc[i]['OPT_STRIKE'],
mrktprice = put_raw_data.iloc[i]['OPT_PREULT'] if put_raw_data.iloc[i]['OPT_PREULT'] > 0 else None ,
dayqntneg = put_raw_data.iloc[i]['OPT_VLMNEG'] if put_raw_data.iloc[i]['OPT_VLMNEG'] > 0 else None ,
cvrdpos = int(put_raw_data.iloc[i]['OPT_COVERED']),
blkspos = int(put_raw_data.iloc[i]['OPT_BLOCKD']),
uncvpos = int(put_raw_data.iloc[i]['OPT_UNCOVRD']),
totlpos = int(put_raw_data.iloc[i]['OPT_TOT_OI']),
holdqty = int(put_raw_data.iloc[i]['OPT_TITULRS']),
writqty = int(put_raw_data.iloc[i]['OPT_LANCDRS']),
iq = put_raw_data.iloc[i]['OPT_TITULRS']/put_raw_data.iloc[i]['OPT_LANCDRS'],
timtoexp = put_raw_data.iloc[i]['CALC_TIME_TO_EXP'],
moneynes = put_raw_data.iloc[i]['CALC_MNYNS'],
impldvol = put_raw_data.iloc[i]['CALC_IV'] if put_raw_data.iloc[i]['CALC_IV'] > 0 else None,
polycofa = put_raw_data.iloc[i]['CALC_P_A'],
polycofb = put_raw_data.iloc[i]['CALC_P_B'],
polycofc = put_raw_data.iloc[i]['CALC_P_C'],
chosnvol = put_raw_data.iloc[i]['CALC_VOL'],
optdelta = put_raw_data.iloc[i]['CALC_DELTA'],
optgamma = put_raw_data.iloc[i]['CALC_GAMMA'],
optpctgm = put_raw_data.iloc[i]['CALC_PCT_GAMMA'],
opttheta = put_raw_data.iloc[i]['CALC_THETA'],
optbgt = put_raw_data.iloc[i]['CALC_BGT'] if put_raw_data.iloc[i]['CALC_BGT'] > 0 else None,
totdelta = int(put_raw_data.iloc[i]['TOT_DELTA']),
totgamma = int(put_raw_data.iloc[i]['TOT_GAMMA']),
ttpctgmm = int(put_raw_data.iloc[i]['TOT_PCT_GAMMA']),
tottheta = int(put_raw_data.iloc[i]['TOT_THETA']),
totbgt = int(put_raw_data.iloc[i]['TOT_BGT'])
)
options_array.append(this_option)
Is there a way to declare the ORM object or code the object instancing in a vectorized manner, avoiding the loops creating new instances?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我尝试了戈德的建议,这确实做到了。
开始-2022-06-29 14:44:00.847775
END B3DB INIT-0:00:00:02.259717
结束核心读取-0:00:00:00.485671
结束计算 - 0:00:
00.582443 :00.056849
END ORM Intancing -0:00:03.634285
END DADODB COMMIT- 0:00:00.963426
END DF.TO_SQL()提交 - 0:00.206448
总计 - 0:00:00:00:00:0:0:0:
08.188839数据框架,将其放入ORM实例和写作将大约需要4.5秒。仅运行DF.TO_SQL()方法将其完成为0.21s。
I have tried Gord's suggestion and that indeed did the trick.
Start -- 2022-06-29 14:44:00.847775
End B3DB Init -- 0:00:02.259717
End CORE Read -- 0:00:00.485671
End Calculations -- 0:00:00.582443
End DADODB Init -- 0:00:00.056849
End ORM Instancing -- 0:00:03.634285
End DADODB Commit -- 0:00:00.963426
End DF.to_sql() Commit -- 0:00:00.206448
Total -- 0:00:08.188839
Starting with the data in a dataframe, putting it into ORM instances and writing would take roughly 4.5s. Just running the DF.to_SQL() method gets it done in 0.21s.