使用 MySQLdb 的嵌套查询
我正在尝试使用 Python 和 MySQLdb 接口实现以下目标:
- 读取具有几百万行的表的内容。
- 处理并修改每一行的输出。
- 将修改后的行放入另一个表中。
对我来说,迭代每一行、即时处理然后将每个新行即时插入到新表中似乎是明智的。
这是可行的:
import MySQLdb
import MySQLdb.cursors
conn=MySQLdb.connect(
host="somehost",user="someuser",
passwd="somepassword",db="somedb")
cursor1 = conn.cursor(MySQLdb.cursors.Cursor)
query1 = "SELECT * FROM table1"
cursor1.execute(query1)
cursor2 = conn.cursor(MySQLdb.cursors.Cursor)
for row in cursor1:
values = some_function(row)
query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
cursor2.execute(query2, values)
cursor2.close()
cursor1.close()
conn.commit()
conn.close()
但这很慢并且消耗内存,因为它使用客户端游标进行 SELECT 查询。如果我改为使用服务器端游标进行 SELECT
查询:
cursor1 = conn.cursor(MySQLdb.cursors.SSCursor)
然后我会收到 2014 错误:
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method SSCursor.__del__ of <MySQLdb.cursors.SSCursor object at 0x925d6ec>> ignored
因此,它似乎不喜欢在迭代服务器端游标时启动另一个游标。这似乎让我陷入了一个非常慢的客户端迭代器的困境。
有什么建议吗?
I am trying to achieve the following using Python and the MySQLdb interface:
- Read the contents of a table that has a few million rows.
- Process and modify the output of every row.
- Put the modified rows into another table.
It seems sensible to me to iterate over each row, process on-the-fly and then insert each new row into the new table on-the-fly.
This works:
import MySQLdb
import MySQLdb.cursors
conn=MySQLdb.connect(
host="somehost",user="someuser",
passwd="somepassword",db="somedb")
cursor1 = conn.cursor(MySQLdb.cursors.Cursor)
query1 = "SELECT * FROM table1"
cursor1.execute(query1)
cursor2 = conn.cursor(MySQLdb.cursors.Cursor)
for row in cursor1:
values = some_function(row)
query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
cursor2.execute(query2, values)
cursor2.close()
cursor1.close()
conn.commit()
conn.close()
But this is slow and memory-consuming since it's using a client-side cursor for the SELECT
query. If I instead use a server-side cursor for the SELECT
query:
cursor1 = conn.cursor(MySQLdb.cursors.SSCursor)
Then I get a 2014 error:
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method SSCursor.__del__ of <MySQLdb.cursors.SSCursor object at 0x925d6ec>> ignored
So it doesn't seem to like starting another cursor while iterating over a server-side cursor. Which seems to leave me stuck with a very slow client-side iterator.
Any suggestions?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您需要与数据库的单独连接,因为第一个连接被困在流式传输结果集上,因此您无法运行插入查询。
试试这个:
You need a seperate connection to the database, since the first connection is stuck with streaming the resultset, you can't run the insert query.
Try this: