当前位置：文江博客话题详情

Python mysql-python MySQL

何时使用 MySQLdb 关闭游标

发布于 2024-11-01 18:54:37 字数 241 浏览 4 评论 0 原文

我正在构建一个 WSGI Web 应用程序，并且有一个 MySQL 数据库。我正在使用 MySQLdb，它提供用于执行语句和获取结果的游标。 获取和关闭游标的标准做法是什么？特别是，我的游标应该持续多长时间？我应该为每笔交易获取一个新的游标吗？

我相信您需要在提交连接之前关闭游标。查找不需要中间提交的事务集（这样您就不必为每个事务获取新游标）是否有任何显着的优势？获取新游标是否会产生大量开销，或者这没什么大不了的？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

抱猫软卧 2024-11-08 18:54:37

您可以尝试向模块本身寻求指导，而不是询问什么是标准实践，因为这通常是不清楚和主观的。一般来说，按照其他用户的建议使用 with 关键字是一个好主意，但在这种特定情况下，它可能无法为您提供所需的功能。

从模块的 1.2.5 版本开始，MySQLdb.Connection 实现了上下文管理器协议，包含以下代码 (github)：

def __enter__(self):
    if self.get_autocommit():
        self.query("BEGIN")
    return self.cursor()

def __exit__(self, exc, value, tb):
    if exc:
        self.rollback()
    else:
        self.commit()

已经有几个关于 with 的现有问答，或者您可以阅读理解 Python 的“with”语句，但本质上发生的是 __enter__ 在 with 块的开头执行，__exit__ 在离开 with 块时执行。如果您打算稍后引用该对象，可以使用可选语法 with EXPR as VAR 将 __enter__ 返回的对象绑定到名称。因此，鉴于上述实现，这是查询数据库的简单方法：

connection = MySQLdb.connect(...)
with connection as cursor:            # connection.__enter__ executes at this line
    cursor.execute('select 1;')
    result = cursor.fetchall()        # connection.__exit__ executes after this line
print result                          # prints "((1L,),)"

现在的问题是，退出 with 块后连接和光标的状态是什么？上面显示的 __exit__ 方法仅调用 self.rollback() 或 self.commit()，并且这些方法都不会继续调用close() 方法。游标本身没有定义 __exit__ 方法 - 即使有也没关系，因为 with 仅管理连接。因此，退出 with 块后，连接和游标都保持打开状态。通过将以下代码添加到上面的示例中可以轻松确认这一点：

try:
    cursor.execute('select 1;')
    print 'cursor is open;',
except MySQLdb.ProgrammingError:
    print 'cursor is closed;',
if connection.open:
    print 'connection is open'
else:
    print 'connection is closed'

您应该看到打印到 stdout 的输出“光标已打开；连接已打开”。

我相信您需要在提交连接之前关闭游标。

为什么？ MySQL C API，是 MySQLdb 的基础，不实现任何游标对象，如模块文档中所暗示的："MySQL 不支持游标；但是，游标很容易被模拟。" 事实上，MySQLdb.cursors.BaseCursor 类直接继承自 object 并且在提交/回滚方面对游标没有这样的限制。一位 Oracle 开发人员这样说：

cnx.commit() 在 cur.close() 之前对我来说听起来最合乎逻辑。也许你
可以遵循规则：“如果不再需要，请关闭光标。”
因此在关闭游标之前提交()。最后，为了
Connector/Python，没有太大区别，但是或者其他
可能是数据库。

我希望这与您在该主题上达到的“标准实践”一样接近。

查找不需要中间提交的事务集（这样您就不必为每个事务获取新游标）是否有任何显着优势？

我非常怀疑这一点，并且在尝试这样做时，您可能会引入额外的人为错误。最好决定一个惯例并坚持下去。

获取新游标是否会产生大量开销，或者这没什么大不了的？

开销可以忽略不计，并且根本不涉及数据库服务器；它完全在 MySQLdb 的实现中。您可以查看 github 上的 BaseCursor.__init__ 如果您真的很想知道创建新游标时发生了什么。

回到之前我们讨论with的时候，也许现在你可以理解为什么MySQLdb.Connection类__enter__和__exit__ 方法在每个 with 块中为您提供一个全新的光标对象，并且不必费心跟踪它或在块末尾关闭它。它相当轻量，纯粹是为了您的方便而存在。

如果微观管理光标对象对您来说确实很重要，您可以使用 contextlib.close 来弥补游标对象没有定义__exit__方法的事实。就此而言，您还可以使用它来强制连接对象在退出 with 块时自行关闭。这应该输出“my_curs is close; my_conn is close”：

from contextlib import closing
import MySQLdb

with closing(MySQLdb.connect(...)) as my_conn:
    with closing(my_conn.cursor()) as my_curs:
        my_curs.execute('select 1;')
        result = my_curs.fetchall()
try:
    my_curs.execute('select 1;')
    print 'my_curs is open;',
except MySQLdb.ProgrammingError:
    print 'my_curs is closed;',
if my_conn.open:
    print 'my_conn is open'
else:
    print 'my_conn is closed'

请注意，with opening(arg_obj)不会调用参数对象的__enter__和__exit__方法;它只会在 with 块末尾调用参数对象的 close 方法。（要查看实际效果，只需定义一个类 Foo，其中包含简单的 __enter__、__exit__ 和 close 方法print 语句，并将执行 with Foo(): pass 时发生的情况与执行 with opening(Foo()): pass 时发生的情况进行比较code>。）这有两个重要的含义：

首先，如果启用了自动提交模式，当您使用with连接并提交或回滚时，MySQLdb将在服务器上BEGIN显式事务区块末尾的交易。这些是 MySQLdb 的默认行为，旨在保护您免受 MySQL 立即提交任何和所有 DML 语句的默认行为的影响。 MySQLdb 假设当您使用上下文管理器时，您需要一个事务，并使用显式的 BEGIN 来绕过服务器上的自动提交设置。如果您习惯于使用带连接，您可能会认为自动提交已被禁用，而实际上它只是被绕过。如果您将结束添加到代码中并失去事务完整性，您可能会感到不愉快；您将无法回滚更改，您可能会开始看到并发错误，但原因可能并不明显。

其次，with opening(MySQLdb.connect(user, pass)) as VAR 将连接对象绑定到VAR，与相反使用 MySQLdb.connect(user, pass) 作为 VAR，它将新的游标对象绑定到VAR。在后一种情况下，您将无法直接访问连接对象！相反，您必须使用光标的 connection 属性，该属性提供对原始连接的代理访问。当光标关闭时，其connection属性设置为None。这会导致废弃的连接将一直存在，直到发生以下情况之一：

删除对游标的所有引用游标
超出范围连接
超时连接
通过服务器管理工具手动关闭

您可以通过监视打开来测试此情况连接（在 Workbench 中或通过使用 SHOW PROCESSLIST)，同时逐一执行以下几行：

with MySQLdb.connect(...) as my_curs:
    pass
my_curs.close()
my_curs.connection          # None
my_curs.connection.close()  # throws AttributeError, but connection still open
del my_curs                 # connection will close here

Instead of asking what is standard practice, since that's often unclear and subjective, you might try looking to the module itself for guidance. In general, using the with keyword as another user suggested is a great idea, but in this specific circumstance it may not give you quite the functionality you expect.

As of version 1.2.5 of the module, MySQLdb.Connection implements the context manager protocol with the following code (github):

def __enter__(self):
    if self.get_autocommit():
        self.query("BEGIN")
    return self.cursor()

def __exit__(self, exc, value, tb):
    if exc:
        self.rollback()
    else:
        self.commit()

There are several existing Q&A about with already, or you can read Understanding Python's "with" statement, but essentially what happens is that __enter__ executes at the start of the with block, and __exit__ executes upon leaving the with block. You can use the optional syntax with EXPR as VAR to bind the object returned by __enter__ to a name if you intend to reference that object later. So, given the above implementation, here's a simple way to query your database:

connection = MySQLdb.connect(...)
with connection as cursor:            # connection.__enter__ executes at this line
    cursor.execute('select 1;')
    result = cursor.fetchall()        # connection.__exit__ executes after this line
print result                          # prints "((1L,),)"

The question now is, what are the states of the connection and the cursor after exiting the with block? The __exit__ method shown above calls only self.rollback() or self.commit(), and neither of those methods go on to call the close() method. The cursor itself has no __exit__ method defined – and wouldn't matter if it did, because with is only managing the connection. Therefore, both the connection and the cursor remain open after exiting the with block. This is easily confirmed by adding the following code to the above example:

try:
    cursor.execute('select 1;')
    print 'cursor is open;',
except MySQLdb.ProgrammingError:
    print 'cursor is closed;',
if connection.open:
    print 'connection is open'
else:
    print 'connection is closed'

You should see the output "cursor is open; connection is open" printed to stdout.

I believe you need to close the cursor before committing the connection.

Why? The MySQL C API, which is the basis for MySQLdb, does not implement any cursor object, as implied in the module documentation: "MySQL does not support cursors; however, cursors are easily emulated." Indeed, the MySQLdb.cursors.BaseCursor class inherits directly from object and imposes no such restriction on cursors with regard to commit/rollback. An Oracle developer had this to say:

cnx.commit() before cur.close() sounds most logical to me. Maybe you
can go by the rule: "Close the cursor if you do not need it anymore."
Thus commit() before closing the cursor. In the end, for
Connector/Python, it does not make much difference, but or other
databases it might.

I expect that's as close as you're going to get to "standard practice" on this subject.

Is there any significant advantage to finding sets of transactions that don't require intermediate commits so that you don't have to get new cursors for each transaction?

I very much doubt it, and in trying to do so, you may introduce additional human error. Better to decide on a convention and stick with it.

Is there a lot of overhead for getting new cursors, or is it just not a big deal?

The overhead is negligible, and doesn't touch the database server at all; it's entirely within the implementation of MySQLdb. You can look at BaseCursor.__init__ on github if you're really curious to know what's happening when you create a new cursor.

Going back to earlier when we were discussing with, perhaps now you can understand why the MySQLdb.Connection class __enter__ and __exit__ methods give you a brand new cursor object in every with block and don't bother keeping track of it or closing it at the end of the block. It's fairly lightweight and exists purely for your convenience.

If it's really that important to you to micromanage the cursor object, you can use contextlib.closing to make up for the fact that the cursor object has no defined __exit__ method. For that matter, you can also use it to force the connection object to close itself upon exiting a with block. This should output "my_curs is closed; my_conn is closed":

from contextlib import closing
import MySQLdb

with closing(MySQLdb.connect(...)) as my_conn:
    with closing(my_conn.cursor()) as my_curs:
        my_curs.execute('select 1;')
        result = my_curs.fetchall()
try:
    my_curs.execute('select 1;')
    print 'my_curs is open;',
except MySQLdb.ProgrammingError:
    print 'my_curs is closed;',
if my_conn.open:
    print 'my_conn is open'
else:
    print 'my_conn is closed'

Note that with closing(arg_obj) will not call the argument object's __enter__ and __exit__ methods; it will only call the argument object's close method at the end of the with block. (To see this in action, simply define a class Foo with __enter__, __exit__, and close methods containing simple print statements, and compare what happens when you do with Foo(): pass to what happens when you do with closing(Foo()): pass.) This has two significant implications:

First, if autocommit mode is enabled, MySQLdb will BEGIN an explicit transaction on the server when you use with connection and commit or rollback the transaction at the end of the block. These are default behaviors of MySQLdb, intended to protect you from MySQL's default behavior of immediately committing any and all DML statements. MySQLdb assumes that when you use a context manager, you want a transaction, and uses the explicit BEGIN to bypass the autocommit setting on the server. If you're used to using with connection, you might think autocommit is disabled when actually it was only being bypassed. You might get an unpleasant surprise if you add closing to your code and lose transactional integrity; you won't be able to rollback changes, you may start seeing concurrency bugs and it may not be immediately obvious why.

Second, with closing(MySQLdb.connect(user, pass)) as VAR binds the connection object to VAR, in contrast to with MySQLdb.connect(user, pass) as VAR, which binds a new cursor object to VAR. In the latter case you would have no direct access to the connection object! Instead, you would have to use the cursor's connection attribute, which provides proxy access to the original connection. When the cursor is closed, its connection attribute is set to None. This results in an abandoned connection that will stick around until one of the following happens:

All references to the cursor are removed
The cursor goes out of scope
The connection times out
The connection is closed manually via server administration tools

You can test this by monitoring open connections (in Workbench or by using SHOW PROCESSLIST) while executing the following lines one by one:

with MySQLdb.connect(...) as my_curs:
    pass
my_curs.close()
my_curs.connection          # None
my_curs.connection.close()  # throws AttributeError, but connection still open
del my_curs                 # connection will close here

回复收藏 0 原文

走过海棠暮 2024-11-08 18:54:37

最好使用“with”关键字重写它。 “With”将自动关闭游标（这很重要，因为它是非托管资源）。好处是它也会在出现异常时关闭游标。

from contextlib import closing
import MySQLdb

''' At the beginning you open a DB connection. Particular moment when
  you open connection depends from your approach:
  - it can be inside the same function where you work with cursors
  - in the class constructor
  - etc
'''
db = MySQLdb.connect("host", "user", "pass", "database")
with closing(db.cursor()) as cur:
    cur.execute("somestuff")
    results = cur.fetchall()
    # do stuff with results

    cur.execute("insert operation")
    # call commit if you do INSERT, UPDATE or DELETE operations
    db.commit()

    cur.execute("someotherstuff")
    results2 = cur.fetchone()
    # do stuff with results2

# at some point when you decided that you do not need
# the open connection anymore you close it
db.close()

It's better to rewrite it using 'with' keyword. 'With' will take care about closing cursor (it's important because it's unmanaged resource) automatically. The benefit is it will close cursor in case of exception too.

from contextlib import closing
import MySQLdb

''' At the beginning you open a DB connection. Particular moment when
  you open connection depends from your approach:
  - it can be inside the same function where you work with cursors
  - in the class constructor
  - etc
'''
db = MySQLdb.connect("host", "user", "pass", "database")
with closing(db.cursor()) as cur:
    cur.execute("somestuff")
    results = cur.fetchall()
    # do stuff with results

    cur.execute("insert operation")
    # call commit if you do INSERT, UPDATE or DELETE operations
    db.commit()

    cur.execute("someotherstuff")
    results2 = cur.fetchone()
    # do stuff with results2

# at some point when you decided that you do not need
# the open connection anymore you close it
db.close()

回复收藏 0 原文

℡Ms空城旧梦 2024-11-08 18:54:37

^{注意：这个答案适用于 PyMySQL，它是 MySQLdb 的直接替代品，并且有效自 MySQLdb 停止维护以来 MySQLdb 的最新版本。我相信这里的一切对于旧版 MySQLdb 来说都是正确的，但还没有检查过。}

首先，一些事实：

Python 的 with 语法在执行之前调用上下文管理器的 __enter__ 方法的身体with 块，以及其后的 __exit__ 方法。
连接具有 __enter__ 方法除了创建和返回游标之外什么也不做，还有一个 __exit__提交或回滚的方法（取决于是否抛出异常）。它不会关闭连接。
PyMySQL 中的游标纯粹是 Python 中实现的抽象； MySQL 本身没有等效的概念。¹
游标有一个 __enter__ 不执行任何操作的方法和 __exit__ “关闭”游标的方法（这仅意味着将游标对其父连接的引用清空并丢弃存储在游标上的任何数据）。
游标保存对生成它们的连接的引用，但连接不保存对它们创建的游标的引用。
连接具有 __del__< /a> 关闭它们的方法
Per https://docs.python.org/3/reference/datamodel.html，CPython（默认的 Python 实现）使用引用计数，并在对象的引用数量为零时自动删除该对象。

将这些东西放在一起，我们发现像这样的简单代码理论上是有问题的：

# Problematic code, at least in theory!
import pymysql
with pymysql.connect() as cursor:
    cursor.execute('SELECT 1')

# ... happily carry on and do something unrelated

问题是没有任何东西关闭连接。事实上，如果您将上面的代码粘贴到 Python shell 中，然后在 MySQL shell 中运行 SHOW FULL PROCESSLIST，您将能够看到您创建的空闲连接。由于MySQL的默认连接数是151，这并不巨大，理论上，如果您有许多进程保持这些连接打开，您可能会开始遇到问题。

然而，在 CPython 中，有一个可取之处，可以确保像我上面的示例这样的代码可能不会导致您留下大量打开的连接。这种可取之处是，一旦cursor超出范围（例如，创建它的函数完成，或者cursor获得分配给它的另一个值），它的引用count 为零，这会导致它被删除，从而将连接的引用计数降至零，从而导致调用连接的 __del__ 方法，从而强制关闭连接。如果您已将上面的代码粘贴到 Python shell 中，那么您现在可以通过运行 cursor = '任意值'; 来模拟它一旦执行此操作，您打开的连接将从 SHOW PROCESSLIST 输出中消失。

然而，依赖于此是不优雅的，理论上在 CPython 之外的 Python 实现中可能会失败。从理论上讲，更干净的方法是显式地 .close() 连接（释放数据库上的连接，而无需等待 Python 销毁对象）。这个更健壮的代码如下所示：

import contextlib
import pymysql
with contextlib.closing(pymysql.connect()) as conn:
    with conn as cursor:
        cursor.execute('SELECT 1')

这很丑陋，但不依赖于 Python 破坏你的对象来释放你的（有限的可用数量）数据库连接。

请注意，如果您已经像这样显式关闭连接，则关闭光标是完全没有意义的。

最后，回答一下这里的次要问题：

获取新游标是否会产生大量开销，或者这没什么大不了的？

不，实例化游标根本不会影响 MySQL 并且基本上什么都不做。

查找不需要中间提交的事务集（这样您就不必为每个事务获取新游标）是否有任何显着优势？

这是具体情况而定的，很难给出一般性的答案。作为 https://dev.mysql.com/doc /refman/en/optimizing-innodb-transaction-management.html 指出，“如果应用程序每秒提交数千次，则可能会遇到性能问题；如果应用程序仅每 2 次提交，则可能会遇到不同的性能问题” -3小时”。您为每次提交付出了性能开销，但是通过让事务打开更长时间，您会增加其他连接必须花费时间等待锁定的机会，增加死锁的风险，并可能增加其他连接执行的某些查找的成本。

^{¹ MySQL 确实有一个称为光标但它们只存在于存储过程中；它们与 PyMySQL 游标完全不同，与这里无关。}

^{Note: this answer is for PyMySQL, which is a drop-in replacement for MySQLdb and effectively the latest version of MySQLdb since MySQLdb stopped being maintained. I believe everything here is also true of the legacy MySQLdb, but haven't checked.}

First of all, some facts:

Python's with syntax calls the context manager's __enter__ method before executing the body of the with block, and its __exit__ method afterwards.
Connections have an __enter__ method that does nothing besides create and return a cursor, and an __exit__ method that either commits or rolls back (depending upon whether an exception was thrown). It does not close the connection.
Cursors in PyMySQL are purely an abstraction implemented in Python; there is no equivalent concept in MySQL itself.¹
Cursors have an __enter__ method that doesn't do anything and an __exit__ method which "closes" the cursor (which just means nulling the cursor's reference to its parent connection and throwing away any data stored on the cursor).
Cursors hold a reference to the connection that spawned them, but connections don't hold a reference to the cursors that they've created.
Connections have a __del__ method which closes them
Per https://docs.python.org/3/reference/datamodel.html, CPython (the default Python implementation) uses reference counting and automatically deletes an object once the number of references to it hits zero.

Putting these things together, we see that naive code like this is in theory problematic:

# Problematic code, at least in theory!
import pymysql
with pymysql.connect() as cursor:
    cursor.execute('SELECT 1')

# ... happily carry on and do something unrelated

The problem is that nothing has closed the connection. Indeed, if you paste the code above into a Python shell and then run SHOW FULL PROCESSLIST at a MySQL shell, you'll be able to see the idle connection that you created. Since MySQL's default number of connections is 151, which isn't huge, you could theoretically start running into problems if you had many processes keeping these connections open.

However, in CPython, there is a saving grace that ensures that code like my example above probably won't cause you to leave around loads of open connections. That saving grace is that as soon as cursor goes out of scope (e.g. the function in which it was created finishes, or cursor gets another value assigned to it), its reference count hits zero, which causes it to be deleted, dropping the connection's reference count to zero, causing the connection's __del__ method to be called which force-closes the connection. If you already pasted the code above into your Python shell, then you can now simulate this by running cursor = 'arbitrary value'; as soon as you do this, the connection you opened will vanish from the SHOW PROCESSLIST output.

However, relying upon this is inelegant, and theoretically might fail in Python implementations other than CPython. Cleaner, in theory, would be to explicitly .close() the connection (to free up a connection on the database without waiting for Python to destroy the object). This more robust code looks like this:

import contextlib
import pymysql
with contextlib.closing(pymysql.connect()) as conn:
    with conn as cursor:
        cursor.execute('SELECT 1')

This is ugly, but doesn't rely upon Python destructing your objects to free up your (finite available number of) database connections.

Note that closing the cursor, if you're already closing the connection explicitly like this, is entirely pointless.

Finally, to answer the secondary questions here:

Is there a lot of overhead for getting new cursors, or is it just not a big deal?

Nope, instantiating a cursor doesn't hit MySQL at all and basically does nothing.

Is there any significant advantage to finding sets of transactions that don't require intermediate commits so that you don't have to get new cursors for each transaction?

This is situational and difficult to give a general answer to. As https://dev.mysql.com/doc/refman/en/optimizing-innodb-transaction-management.html puts it, "an application might encounter performance issues if it commits thousands of times per second, and different performance issues if it commits only every 2-3 hours". You pay a performance overhead for every commit, but by leaving transactions open for longer, you increase the chance of other connections having to spend time waiting for locks, increase your risk of deadlocks, and potentially increase the cost of some lookups performed by other connections.

^{¹ MySQL does have a construct it calls a cursor but they only exist inside stored procedures; they're completely different to PyMySQL cursors and are not relevant here.}

回复收藏 0 原文

煞人兵器 2024-11-08 18:54:37

我认为您最好尝试在所有执行中使用一个游标，并在代码末尾关闭它。它更容易使用，并且还可能具有效率优势（不要引用我的话）。

conn = MySQLdb.connect("host","user","pass","database")
cursor = conn.cursor()
cursor.execute("somestuff")
results = cursor.fetchall()
..do stuff with results
cursor.execute("someotherstuff")
results2 = cursor.fetchall()
..do stuff with results2
cursor.close()

关键是您可以将游标执行的结果存储在另一个变量中，从而释放游标以进行第二次执行。仅当您使用 fetchone() 并且需要在迭代第一个查询的所有结果之前执行第二个游标时，才会以这种方式遇到问题。

否则，我会说，一旦你完成了所有数据的获取，就关闭你的光标。这样您就不必担心稍后在代码中出现未解决的问题。

I think you'll be better off trying to use one cursor for all of your executions, and close it at the end of your code. It's easier to work with, and it might have efficiency benefits as well (don't quote me on that one).

conn = MySQLdb.connect("host","user","pass","database")
cursor = conn.cursor()
cursor.execute("somestuff")
results = cursor.fetchall()
..do stuff with results
cursor.execute("someotherstuff")
results2 = cursor.fetchall()
..do stuff with results2
cursor.close()

The point is that you can store the results of a cursor's execution in another variable, thereby freeing your cursor to make a second execution. You run into problems this way only if you're using fetchone(), and need to make a second cursor execution before you've iterated through all results from the first query.

Otherwise, I'd say just close your cursors as soon as you're done getting all of the data out of them. That way you don't have to worry about tying up loose ends later in your code.

回复收藏 0 原文