优化pymssql代码

发布于 2024-08-30 14:32:41 字数 467 浏览 5 评论 0原文

我正在使用 pymssql 将记录从 python 插入到 sql server。数据库执行一次查询需要 2 毫秒,但每秒插入 6 行。唯一的问题是在代码方面。如何优化以下代码或插入记录最快的方法是什么。

def save(self):
    conn = pymssql.connect(host=dbHost, user=dbUser, 
                           password=dbPassword, database=dbName, as_dict=True)
    cur = conn.cursor()

    self.pageURL = self.pageURL.replace("'","''")

    query = "my query is there"               
    cur.execute(query)

    conn.commit() 
    conn.close()

i am inserting records to sql server from python using pymssql. The database takes 2 milliseconds to execute a query, yet it insert 6 rows per second. The only problem is at code side. how to optimize following code or what is the fastest method to insert records.

def save(self):
    conn = pymssql.connect(host=dbHost, user=dbUser, 
                           password=dbPassword, database=dbName, as_dict=True)
    cur = conn.cursor()

    self.pageURL = self.pageURL.replace("'","''")

    query = "my query is there"               
    cur.execute(query)

    conn.commit() 
    conn.close()

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

雪化雨蝶 2024-09-06 14:32:41

看起来您正在为每个插入创建一个新连接。这可能是速度放缓的主要原因:建立新连接通常非常缓慢。在方法外部创建连接,您应该会看到很大的改进。您还可以在函数外部创建一个游标并重新使用它,这将是另一个加速。

根据您的情况,您可能还希望使用同一事务进行多次插入。这稍微改变了行为——因为事务应该是原子的,要么完全成功,要么完全失败——但是提交事务通常是一个缓慢的操作,因为它必须确保整个操作成功。

It looks like you're creating a new connection per insert there. That's probably the major reason for the slowdown: building new connections is typically quite slow. Create the connection outside the method and you should see a large improvement. You can also create a cursor outside function and re-use it, which will be another speedup.

Depending on your situation, you may also want to use the same transaction for more than a single insertion. This changes the behaviour a little -- since a transaction is supposed to be atomic and either completely succeeds or completely fails -- but committing a transaction is typically a slow operation, because it has to be certain the whole operation succeeded.

谎言 2024-09-06 14:32:41

除了托马斯的伟大建议之外,
我建议您查看 executemany()*,eg

cur.executemany("INSERT INTO persons VALUES(%d, %s)", 
    [ (1, 'John Doe'), (2, 'Jane Doe') ])

...其中 executemany() 的第二个参数应该是要插入的行序列。

这就引出了另一点:
您可能希望将查询和查询参数作为单独的参数发送到execute()executemany()。这将允许 PyMSSQL 模块为您处理任何引用问题。

*executemany()Python DB-API 中所述

.executemany(操作,seq_of_parameters)
准备数据库操作(查询或
命令),然后执行它
所有参数序列或映射
在序列中发现
seq_of_parameters。

In addition to Thomas' great advice,
I'd suggest you look into executemany()*, e.g.:

cur.executemany("INSERT INTO persons VALUES(%d, %s)", 
    [ (1, 'John Doe'), (2, 'Jane Doe') ])

...where the second argument of executemany() should be a sequence of rows to insert.

This brings up another point:
You probably want to send your query and query parameters as separate arguments to either execute() or executemany(). This will allow the PyMSSQL module to handle any quoting issues for you.

*executemany() as described in the Python DB-API:

.executemany(operation,seq_of_parameters)
Prepare a database operation (query or
command) and then execute it against
all parameter sequences or mappings
found in the sequence
seq_of_parameters.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文