我正在构建一个 WSGI Web 应用程序,并且有一个 MySQL 数据库。我正在使用 MySQLdb,它提供用于执行语句和获取结果的游标。 获取和关闭游标的标准做法是什么?特别是,我的游标应该持续多长时间?我应该为每笔交易获取一个新的游标吗?
我相信您需要在提交连接之前关闭游标。查找不需要中间提交的事务集(这样您就不必为每个事务获取新游标)是否有任何显着的优势?获取新游标是否会产生大量开销,或者这没什么大不了的?
I'm building a WSGI web app and I have a MySQL database. I'm using MySQLdb, which provides cursors for executing statements and getting results. What is the standard practice for getting and closing cursors? In particular, how long should my cursors last? Should I get a new cursor for each transaction?
I believe you need to close the cursor before committing the connection. Is there any significant advantage to finding sets of transactions that don't require intermediate commits so that you don't have to get new cursors for each transaction? Is there a lot of overhead for getting new cursors, or is it just not a big deal?
发布评论
评论(5)
您可以尝试向模块本身寻求指导,而不是询问什么是标准实践,因为这通常是不清楚和主观的。一般来说,按照其他用户的建议使用
with
关键字是一个好主意,但在这种特定情况下,它可能无法为您提供所需的功能。从模块的 1.2.5 版本开始,MySQLdb.Connection 实现了 上下文管理器协议,包含以下代码 (github):
已经有几个关于
with
的现有问答,或者您可以阅读 理解 Python 的“with”语句,但本质上发生的是__enter__ 在
with
块的开头执行,__exit__
在离开with
块时执行。如果您打算稍后引用该对象,可以使用可选语法with EXPR as VAR
将__enter__
返回的对象绑定到名称。因此,鉴于上述实现,这是查询数据库的简单方法:现在的问题是,退出
with
块后连接和光标的状态是什么?上面显示的__exit__
方法仅调用self.rollback()
或self.commit()
,并且这些方法都不会继续调用close()
方法。游标本身没有定义__exit__
方法 - 即使有也没关系,因为with
仅管理连接。因此,退出with
块后,连接和游标都保持打开状态。通过将以下代码添加到上面的示例中可以轻松确认这一点:您应该看到打印到 stdout 的输出“光标已打开;连接已打开”。
为什么? MySQL C API,是 MySQLdb 的基础,不实现任何游标对象,如模块文档中所暗示的:"MySQL 不支持游标;但是,游标很容易被模拟。" 事实上,
MySQLdb.cursors.BaseCursor
类直接继承自object
并且在提交/回滚方面对游标没有这样的限制。一位 Oracle 开发人员这样说:我希望这与您在该主题上达到的“标准实践”一样接近。
我非常怀疑这一点,并且在尝试这样做时,您可能会引入额外的人为错误。最好决定一个惯例并坚持下去。
开销可以忽略不计,并且根本不涉及数据库服务器;它完全在 MySQLdb 的实现中。您可以 查看 github 上的
BaseCursor.__init__
如果您真的很想知道创建新游标时发生了什么。回到之前我们讨论
with
的时候,也许现在你可以理解为什么MySQLdb.Connection
类__enter__
和__exit__ 方法在每个
with
块中为您提供一个全新的光标对象,并且不必费心跟踪它或在块末尾关闭它。它相当轻量,纯粹是为了您的方便而存在。如果微观管理光标对象对您来说确实很重要,您可以使用 contextlib.close 来弥补游标对象没有定义
__exit__
方法的事实。就此而言,您还可以使用它来强制连接对象在退出with
块时自行关闭。这应该输出“my_curs is close; my_conn is close”:请注意,
with opening(arg_obj)
不会调用参数对象的__enter__
和__exit__
方法;它只会在with
块末尾调用参数对象的close
方法。 (要查看实际效果,只需定义一个类Foo
,其中包含简单的__enter__
、__exit__
和close
方法print
语句,并将执行with Foo(): pass
时发生的情况与执行with opening(Foo()): pass
时发生的情况进行比较code>。)这有两个重要的含义:首先,如果启用了自动提交模式,当您使用
with连接
并提交或回滚时,MySQLdb将在服务器上BEGIN
显式事务区块末尾的交易。这些是 MySQLdb 的默认行为,旨在保护您免受 MySQL 立即提交任何和所有 DML 语句的默认行为的影响。 MySQLdb 假设当您使用上下文管理器时,您需要一个事务,并使用显式的 BEGIN 来绕过服务器上的自动提交设置。如果您习惯于使用带连接
,您可能会认为自动提交已被禁用,而实际上它只是被绕过。如果您将结束
添加到代码中并失去事务完整性,您可能会感到不愉快;您将无法回滚更改,您可能会开始看到并发错误,但原因可能并不明显。其次,
with opening(MySQLdb.connect(user, pass)) as VAR
将连接对象绑定到VAR
,与相反使用 MySQLdb.connect(user, pass) 作为 VAR
,它将新的游标对象绑定到VAR
。在后一种情况下,您将无法直接访问连接对象!相反,您必须使用光标的connection
属性,该属性提供对原始连接的代理访问。当光标关闭时,其connection
属性设置为None
。这会导致废弃的连接将一直存在,直到发生以下情况之一:您可以通过监视打开来测试此情况连接(在 Workbench 中或通过 使用
SHOW PROCESSLIST
),同时逐一执行以下几行:
Instead of asking what is standard practice, since that's often unclear and subjective, you might try looking to the module itself for guidance. In general, using the
with
keyword as another user suggested is a great idea, but in this specific circumstance it may not give you quite the functionality you expect.As of version 1.2.5 of the module,
MySQLdb.Connection
implements the context manager protocol with the following code (github):There are several existing Q&A about
with
already, or you can read Understanding Python's "with" statement, but essentially what happens is that__enter__
executes at the start of thewith
block, and__exit__
executes upon leaving thewith
block. You can use the optional syntaxwith EXPR as VAR
to bind the object returned by__enter__
to a name if you intend to reference that object later. So, given the above implementation, here's a simple way to query your database:The question now is, what are the states of the connection and the cursor after exiting the
with
block? The__exit__
method shown above calls onlyself.rollback()
orself.commit()
, and neither of those methods go on to call theclose()
method. The cursor itself has no__exit__
method defined – and wouldn't matter if it did, becausewith
is only managing the connection. Therefore, both the connection and the cursor remain open after exiting thewith
block. This is easily confirmed by adding the following code to the above example:You should see the output "cursor is open; connection is open" printed to stdout.
Why? The MySQL C API, which is the basis for
MySQLdb
, does not implement any cursor object, as implied in the module documentation: "MySQL does not support cursors; however, cursors are easily emulated." Indeed, theMySQLdb.cursors.BaseCursor
class inherits directly fromobject
and imposes no such restriction on cursors with regard to commit/rollback. An Oracle developer had this to say:I expect that's as close as you're going to get to "standard practice" on this subject.
I very much doubt it, and in trying to do so, you may introduce additional human error. Better to decide on a convention and stick with it.
The overhead is negligible, and doesn't touch the database server at all; it's entirely within the implementation of MySQLdb. You can look at
BaseCursor.__init__
on github if you're really curious to know what's happening when you create a new cursor.Going back to earlier when we were discussing
with
, perhaps now you can understand why theMySQLdb.Connection
class__enter__
and__exit__
methods give you a brand new cursor object in everywith
block and don't bother keeping track of it or closing it at the end of the block. It's fairly lightweight and exists purely for your convenience.If it's really that important to you to micromanage the cursor object, you can use contextlib.closing to make up for the fact that the cursor object has no defined
__exit__
method. For that matter, you can also use it to force the connection object to close itself upon exiting awith
block. This should output "my_curs is closed; my_conn is closed":Note that
with closing(arg_obj)
will not call the argument object's__enter__
and__exit__
methods; it will only call the argument object'sclose
method at the end of thewith
block. (To see this in action, simply define a classFoo
with__enter__
,__exit__
, andclose
methods containing simpleprint
statements, and compare what happens when you dowith Foo(): pass
to what happens when you dowith closing(Foo()): pass
.) This has two significant implications:First, if autocommit mode is enabled, MySQLdb will
BEGIN
an explicit transaction on the server when you usewith connection
and commit or rollback the transaction at the end of the block. These are default behaviors of MySQLdb, intended to protect you from MySQL's default behavior of immediately committing any and all DML statements. MySQLdb assumes that when you use a context manager, you want a transaction, and uses the explicitBEGIN
to bypass the autocommit setting on the server. If you're used to usingwith connection
, you might think autocommit is disabled when actually it was only being bypassed. You might get an unpleasant surprise if you addclosing
to your code and lose transactional integrity; you won't be able to rollback changes, you may start seeing concurrency bugs and it may not be immediately obvious why.Second,
with closing(MySQLdb.connect(user, pass)) as VAR
binds the connection object toVAR
, in contrast towith MySQLdb.connect(user, pass) as VAR
, which binds a new cursor object toVAR
. In the latter case you would have no direct access to the connection object! Instead, you would have to use the cursor'sconnection
attribute, which provides proxy access to the original connection. When the cursor is closed, itsconnection
attribute is set toNone
. This results in an abandoned connection that will stick around until one of the following happens:You can test this by monitoring open connections (in Workbench or by using
SHOW PROCESSLIST
) while executing the following lines one by one:最好使用“with”关键字重写它。 “With”将自动关闭游标(这很重要,因为它是非托管资源)。好处是它也会在出现异常时关闭游标。
It's better to rewrite it using 'with' keyword. 'With' will take care about closing cursor (it's important because it's unmanaged resource) automatically. The benefit is it will close cursor in case of exception too.
注意:这个答案适用于 PyMySQL,它是 MySQLdb 的直接替代品,并且有效自 MySQLdb 停止维护以来 MySQLdb 的最新版本。我相信这里的一切对于旧版 MySQLdb 来说都是正确的,但还没有检查过。
首先,一些事实:
with
语法在执行之前调用上下文管理器的__enter__
方法的身体with
块,以及其后的__exit__
方法。__enter__
方法除了创建和返回游标之外什么也不做,还有一个__exit__
提交或回滚的方法(取决于是否抛出异常)。它不会关闭连接。__enter__
不执行任何操作的方法和__exit__
“关闭”游标的方法(这仅意味着将游标对其父连接的引用清空并丢弃存储在游标上的任何数据)。__del__
< /a> 关闭它们的方法将这些东西放在一起,我们发现像这样的简单代码理论上是有问题的:
问题是没有任何东西关闭连接。事实上,如果您将上面的代码粘贴到 Python shell 中,然后在 MySQL shell 中运行
SHOW FULL PROCESSLIST
,您将能够看到您创建的空闲连接。由于MySQL的默认连接数是151,这并不巨大,理论上,如果您有许多进程保持这些连接打开,您可能会开始遇到问题。然而,在 CPython 中,有一个可取之处,可以确保像我上面的示例这样的代码可能不会导致您留下大量打开的连接。这种可取之处是,一旦
cursor
超出范围(例如,创建它的函数完成,或者cursor
获得分配给它的另一个值),它的引用count 为零,这会导致它被删除,从而将连接的引用计数降至零,从而导致调用连接的 __del__ 方法,从而强制关闭连接。如果您已将上面的代码粘贴到 Python shell 中,那么您现在可以通过运行cursor = '任意值'
; 来模拟它一旦执行此操作,您打开的连接将从SHOW PROCESSLIST
输出中消失。然而,依赖于此是不优雅的,理论上在 CPython 之外的 Python 实现中可能会失败。从理论上讲,更干净的方法是显式地
.close()
连接(释放数据库上的连接,而无需等待 Python 销毁对象)。这个更健壮的代码如下所示:这很丑陋,但不依赖于 Python 破坏你的对象来释放你的(有限的可用数量)数据库连接。
请注意,如果您已经像这样显式关闭连接,则关闭光标是完全没有意义的。
最后,回答一下这里的次要问题:
不,实例化游标根本不会影响 MySQL 并且 基本上什么都不做。
这是具体情况而定的,很难给出一般性的答案。作为 https://dev.mysql.com/doc /refman/en/optimizing-innodb-transaction-management.html 指出,“如果应用程序每秒提交数千次,则可能会遇到性能问题;如果应用程序仅每 2 次提交,则可能会遇到不同的性能问题” -3小时”。您为每次提交付出了性能开销,但是通过让事务打开更长时间,您会增加其他连接必须花费时间等待锁定的机会,增加死锁的风险,并可能增加其他连接执行的某些查找的成本。
1 MySQL 确实有一个称为 光标 但它们只存在于存储过程中;它们与 PyMySQL 游标完全不同,与这里无关。
Note: this answer is for PyMySQL, which is a drop-in replacement for MySQLdb and effectively the latest version of MySQLdb since MySQLdb stopped being maintained. I believe everything here is also true of the legacy MySQLdb, but haven't checked.
First of all, some facts:
with
syntax calls the context manager's__enter__
method before executing the body of thewith
block, and its__exit__
method afterwards.__enter__
method that does nothing besides create and return a cursor, and an__exit__
method that either commits or rolls back (depending upon whether an exception was thrown). It does not close the connection.__enter__
method that doesn't do anything and an__exit__
method which "closes" the cursor (which just means nulling the cursor's reference to its parent connection and throwing away any data stored on the cursor).__del__
method which closes themPutting these things together, we see that naive code like this is in theory problematic:
The problem is that nothing has closed the connection. Indeed, if you paste the code above into a Python shell and then run
SHOW FULL PROCESSLIST
at a MySQL shell, you'll be able to see the idle connection that you created. Since MySQL's default number of connections is 151, which isn't huge, you could theoretically start running into problems if you had many processes keeping these connections open.However, in CPython, there is a saving grace that ensures that code like my example above probably won't cause you to leave around loads of open connections. That saving grace is that as soon as
cursor
goes out of scope (e.g. the function in which it was created finishes, orcursor
gets another value assigned to it), its reference count hits zero, which causes it to be deleted, dropping the connection's reference count to zero, causing the connection's__del__
method to be called which force-closes the connection. If you already pasted the code above into your Python shell, then you can now simulate this by runningcursor = 'arbitrary value'
; as soon as you do this, the connection you opened will vanish from theSHOW PROCESSLIST
output.However, relying upon this is inelegant, and theoretically might fail in Python implementations other than CPython. Cleaner, in theory, would be to explicitly
.close()
the connection (to free up a connection on the database without waiting for Python to destroy the object). This more robust code looks like this:This is ugly, but doesn't rely upon Python destructing your objects to free up your (finite available number of) database connections.
Note that closing the cursor, if you're already closing the connection explicitly like this, is entirely pointless.
Finally, to answer the secondary questions here:
Nope, instantiating a cursor doesn't hit MySQL at all and basically does nothing.
This is situational and difficult to give a general answer to. As https://dev.mysql.com/doc/refman/en/optimizing-innodb-transaction-management.html puts it, "an application might encounter performance issues if it commits thousands of times per second, and different performance issues if it commits only every 2-3 hours". You pay a performance overhead for every commit, but by leaving transactions open for longer, you increase the chance of other connections having to spend time waiting for locks, increase your risk of deadlocks, and potentially increase the cost of some lookups performed by other connections.
1 MySQL does have a construct it calls a cursor but they only exist inside stored procedures; they're completely different to PyMySQL cursors and are not relevant here.
我认为您最好尝试在所有执行中使用一个游标,并在代码末尾关闭它。它更容易使用,并且还可能具有效率优势(不要引用我的话)。
关键是您可以将游标执行的结果存储在另一个变量中,从而释放游标以进行第二次执行。仅当您使用 fetchone() 并且需要在迭代第一个查询的所有结果之前执行第二个游标时,才会以这种方式遇到问题。
否则,我会说,一旦你完成了所有数据的获取,就关闭你的光标。这样您就不必担心稍后在代码中出现未解决的问题。
I think you'll be better off trying to use one cursor for all of your executions, and close it at the end of your code. It's easier to work with, and it might have efficiency benefits as well (don't quote me on that one).
The point is that you can store the results of a cursor's execution in another variable, thereby freeing your cursor to make a second execution. You run into problems this way only if you're using fetchone(), and need to make a second cursor execution before you've iterated through all results from the first query.
Otherwise, I'd say just close your cursors as soon as you're done getting all of the data out of them. That way you don't have to worry about tying up loose ends later in your code.
我建议像php和mysql那样做。在打印第一个数据之前,在代码开头启动 i 。因此,如果您收到连接错误,您可以显示
50x
(不记得什么是内部错误)错误消息。并在整个会话期间保持其打开状态,并在您知道不再需要它时将其关闭。I suggest to do it like php and mysql. Start i at the beginning of your code before printing of the first data. So if you get a connect error you can display a
50x
(Don't remember what internal error is) error message. And keep it open for the whole session and close it when you know you wont need it anymore.