为什么 Python 中的线程 MySQLdb 查询比相同的非线程查询慢?
我正在构建一个线程类来使用 Python 和 MySQLdb 运行 MySQL 查询。我不明白为什么线程运行这些查询比非线程运行它们慢。这是我的代码来展示我在做什么。
首先,这是非线程函数。
def testQueryDo(query_list):
db = MySQLdb.connect('localhost', 'user', 'pass', 'db_name')
cursor = db.cursor()
q_list = query_list
for each in q_list:
cursor.execute(each)
results = cursor.fetchall()
db.close()
这是我的线程类:
class queryThread(threading.Thread):
def __init__(self, queue):
threading.Thread.__init__(self)
self.queue = queue
self.db = MySQLdb.connect('localhost', 'user', 'pass', 'db_name')
self.cursor = self.db.cursor()
def run(self):
cur_query = self.queue.get()
self.cursor.execute(cur_query)
results = self.cursor.fetchall()
self.db.close()
self.queue.task_done()
这是处理程序:
def queryHandler(query_list):
queue = Queue.Queue()
for query in query_list:
queue.put(query)
total_queries = len(query_list)
for query in range(total_queries):
t = queryThread(queue)
t.setDaemon(True)
t.start()
queue.join()
我不确定为什么这个线程代码运行速度较慢。有趣的是,如果我使用相同的代码,只做一些简单的事情,比如数字相加,线程代码会显着更快。
我知道我一定错过了一些完全明显的东西,但是任何支持将不胜感激!
I am building a threaded class to run MySQL queries using Python and MySQLdb. I don't understand why running these queries threaded is slower than running them non-threaded. Here's my code to show what I'm doing.
First, here's the non-threaded function.
def testQueryDo(query_list):
db = MySQLdb.connect('localhost', 'user', 'pass', 'db_name')
cursor = db.cursor()
q_list = query_list
for each in q_list:
cursor.execute(each)
results = cursor.fetchall()
db.close()
Here's my threaded class:
class queryThread(threading.Thread):
def __init__(self, queue):
threading.Thread.__init__(self)
self.queue = queue
self.db = MySQLdb.connect('localhost', 'user', 'pass', 'db_name')
self.cursor = self.db.cursor()
def run(self):
cur_query = self.queue.get()
self.cursor.execute(cur_query)
results = self.cursor.fetchall()
self.db.close()
self.queue.task_done()
And here's the handler:
def queryHandler(query_list):
queue = Queue.Queue()
for query in query_list:
queue.put(query)
total_queries = len(query_list)
for query in range(total_queries):
t = queryThread(queue)
t.setDaemon(True)
t.start()
queue.join()
I'm not sure why this threaded code is running slower. What's interesting is that if I use the same code, only do something simple like addition of numbers, the threaded code is significantly faster.
I understand that I must be missing something completely obvious, however any support would be much appreciated!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您正在启动 N 个线程,每个线程都创建自己的 MySQL 连接,并且您正在使用同步队列将查询传递给线程。每个线程都在
queue.get()
上阻塞(获取独占锁)以获取查询,然后创建到数据库的连接,然后调用task_done()
让下一个线程继续进行。因此,当线程 1 工作时,N-1 个线程什么也不做。锁获取/释放的开销,加上连续创建和关闭多个数据库连接的额外开销加起来。You're starting N threads, each of which creates its own connection to MySQL, and you're using a synchronous queue to deliver the queries to the threads. Each thread is blocking on
queue.get()
(acquiring an exclusive lock) to get a query, then creating a connection to the database, and then callingtask_done()
which lets the next thread proceed. So while thread 1 is working, N-1 threads are doing nothing. This overhead of lock acquire/release, plus the addditional overhead of serially creating and closing several connections to the database adds up.