Fastest way to upload a dataframe to a table in MS SQL Server?



I've been trying to upload a huge dataframe to a table in SQL Server. The dataframe contains 1M+ rows and more than 70 columns, and with every approach I've tried it takes 40 minutes or more to upload. Is there a faster way to do this? Here are a couple of snippets I've tried:

Using BCPandas takes 40 minutes:

from bcpandas import SqlCreds, to_sql

# SqlCreds expects the server, database, username and password.
creds = SqlCreds(
    'server',
    'database',
    'user',
    'password',
)

to_sql(dataframeHuge, 'table_test', creds, index=False, if_exists='append', schema='gcp')

Using SQLAlchemy takes more than an hour:

import urllib
import sqlalchemy
import pyodbc

server = 'servertest'
database = 'dbtest'
username = 'untest'
password = 'passwordtest'
driver = 'ODBC Driver 17 for SQL Server'

params = 'DRIVER=' + driver + ';SERVER=' + server + ';PORT=1433;DATABASE=' + database + ';UID=' + username + ';PWD=' + password
db_params = urllib.parse.quote_plus(params)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect={}".format(db_params))

# Sanity check: list the ODBC drivers that are installed.
print(pyodbc.drivers())

from sqlalchemy import event

# Enable pyodbc's fast_executemany so inserts are sent as parameter
# arrays instead of one round trip per row.
@event.listens_for(engine, "before_cursor_execute")
def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
    if executemany:
        cursor.fast_executemany = True

dataframeHuge.to_sql('testtable', engine, index=False, if_exists="append", schema="dbo")
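
For reference, the same fast_executemany behaviour can be requested with less code by passing it directly to create_engine (the mssql+pyodbc dialect supports this since SQLAlchemy 1.3), and chunking the insert keeps pandas from building one enormous parameter set in memory. This is only a sketch, reusing the placeholder credentials above:

import urllib
import sqlalchemy

conn_str = urllib.parse.quote_plus(
    'DRIVER=ODBC Driver 17 for SQL Server;SERVER=servertest;PORT=1433;'
    'DATABASE=dbtest;UID=untest;PWD=passwordtest'
)

# fast_executemany tells pyodbc to send rows as parameter arrays
# instead of issuing one round trip per row.
engine = sqlalchemy.create_engine(
    'mssql+pyodbc:///?odbc_connect={}'.format(conn_str),
    fast_executemany=True,
)

# chunksize bounds how many rows go into each executemany batch.
dataframeHuge.to_sql('testtable', engine, index=False, if_exists='append',
                     schema='dbo', chunksize=10000)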

Is there a faster way to do this? Also, does the connection speed affect the upload?
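
One commonly suggested faster path for million-row loads is to stage the dataframe as a flat file and let SQL Server ingest it with BULK INSERT (roughly what bcpandas does through the bcp utility). It only works when the SQL Server instance can read the file path, and the sketch below assumes the data has no embedded delimiters or newlines; the path, table name and credentials are placeholders:

import pyodbc

# Stage the dataframe as a pipe-delimited file the SQL Server machine
# can reach (no header, since the target table already exists).
dataframeHuge.to_csv(r'C:\staging\huge.csv', sep='|', index=False, header=False)

conn = pyodbc.connect(
    'DRIVER=ODBC Driver 17 for SQL Server;SERVER=servertest;'
    'DATABASE=dbtest;UID=untest;PWD=passwordtest',
    autocommit=True,
)
# ROWTERMINATOR must match the file's line endings (0x0a = LF; use
# '\r\n' if the file was written with CRLF). TABLOCK takes a table-level
# lock, and BATCHSIZE commits every 100,000 rows.
conn.execute(r"""
    BULK INSERT dbo.testtable
    FROM 'C:\staging\huge.csv'
    WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '0x0a',
          TABLOCK, BATCHSIZE = 100000)
""")
conn.close()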
