Fastest way to upload a dataframe to a table in MS SQL Server?
I've been trying to upload a huge dataframe to a table in SQL Server. The dataframe contains 1M+ rows and 70+ columns, and every approach I've tried takes 40 minutes or more to upload it. Is there a faster way to do so? Here are a couple of the snippets I've tried:
Using BCPandas takes 40 minutes:
from bcpandas import SqlCreds, to_sql

# build the bcpandas credentials (placeholder values)
creds = SqlCreds(
    'server',
    'schema',
    'user',
    'password'
)
# bulk-load the dataframe through the bcp utility
to_sql(dataframeHuge, 'table_test', creds, index=False, if_exists='append', schema='gcp')
Using SQLAlchemy takes more than an hour:
import urllib
import sqlalchemy
import pyodbc
from sqlalchemy import event

server = 'servertest'
database = 'dbtest'
username = 'untest'
password = 'passwordtest'
driver = 'ODBC Driver 17 for SQL Server'

params = 'DRIVER=' + driver + ';SERVER=' + server + ';PORT=1433;DATABASE=' + database + ';UID=' + username + ';PWD=' + password
db_params = urllib.parse.quote_plus(params)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect={}".format(db_params))

pyodbc.drivers()  # sanity check that the ODBC driver is installed

# enable pyodbc's fast_executemany for the bulk INSERT statements
@event.listens_for(engine, "before_cursor_execute")
def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
    if executemany:
        cursor.fast_executemany = True

dataframeHuge.to_sql('testtable', engine, index=False, if_exists="append", schema="dbo")
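For reference, here is a sketch of a variant of the same SQLAlchemy approach that I have not timed yet: fast_executemany passed directly to create_engine (available for mssql+pyodbc in SQLAlchemy 1.3+) together with a chunked to_sql. The connection values are the same placeholders as above and dataframeHuge is the same dataframe.

import urllib
import sqlalchemy

# same placeholder connection string as above
params = urllib.parse.quote_plus(
    'DRIVER=ODBC Driver 17 for SQL Server;SERVER=servertest;PORT=1433;'
    'DATABASE=dbtest;UID=untest;PWD=passwordtest'
)
# fast_executemany=True turns on the pyodbc fast path for every executemany call
engine = sqlalchemy.create_engine(
    'mssql+pyodbc:///?odbc_connect={}'.format(params),
    fast_executemany=True,
)
# write in batches of 10,000 rows instead of one giant executemany
dataframeHuge.to_sql(
    'testtable',
    engine,
    schema='dbo',
    index=False,
    if_exists='append',
    chunksize=10000,
)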
Is there a faster way to do this? Also, does the connection speed affect the upload?