使用 python 多处理时,mysql 得到不同的结果
我不知道我做错了什么(或者如何纠正它)。显示一些代码可能会更容易(它比我正在做的事情有点简化,但它证明了我的观点):
from multiprocessing import Pool
import MySQLdb
import sys
#sql connection
try:
conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
except MySQLdb.Error, e:
print "Error %d: %s" % (e.args[0], e.args[1])
sys.exit (1)
#with database
cursor = conn.cursor ()
cursor.execute ("DROP TABLE IF EXISTS data_table")
cursor.execute ("""
CREATE TABLE data_table(
value CHAR(80)
) ENGINE=MyISAM
""")
cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [0, 0]) #need to insert basecase
conn.commit()
def build_table(i,x): # i is index, x is data[i]
conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
cursor = conn.cursor ()
#print i,x
target_sum = 100
for s in range(target_sum + 1):
for c in range(target_sum + 1):
#print c, i
cursor.execute ("""
INSERT INTO data_table (value)
SELECT '%s'
FROM dual
WHERE ( SELECT COUNT(*) FROM data_table WHERE value='%s' )
= 1
AND NOT EXISTS
( SELECT * FROM data_table WHERE value='%s' )
""" % ([s, i+1], [s - c * x, i], [s, i+1]))
conn.commit()
conn.close()
data = [2,5,8]
pool = Pool(processes=4)
for i, x in enumerate(data):
build_table(i,x) #creates 250 records
#pool.apply_async(build_table, (i, x))
pool.close()
pool.join()
print 'completed'
它基本上在 mysql 中创建了一个表。上面的代码创建了 250 个条目(这是正确的),但是如果您在 for 循环中注释掉 build_table(i,x)
并取消注释 pool.apply_async(build_table, (i, x) )
它只创建了 52 条记录。为什么在多重处理同一函数时会出现差异,我可以做些什么来修复它,以便结果是相同的(我认为快速提交更新可以修复它,但没有运气)?
如果我玩 pool = Pool(processes=4)
并将其更改为 1,它会起作用,但我想这是预期的,因为此时它实际上并不是多处理。另外,如果有帮助的话我正在使用 InnoDB。
更新:当我更改为 MyISAM 时,我得到了 240 个正在更新的结果(不完全是我需要的 250 个,但比 52 个好得多)。
UPDATE2:mysql 命令被合并为单个命令,结果似乎有所不同。有时我会在数据库中得到 248 个结果,有时会得到 240 个或更少。也许多重处理导致了预期结果和实际结果之间的这种差异?
I can't figure out what I'm doing wrong(or how to correct it). It might be easier to show some code(its a bit simplified from what I'm doing but it proves my point):
from multiprocessing import Pool
import MySQLdb
import sys
#sql connection
try:
conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
except MySQLdb.Error, e:
print "Error %d: %s" % (e.args[0], e.args[1])
sys.exit (1)
#with database
cursor = conn.cursor ()
cursor.execute ("DROP TABLE IF EXISTS data_table")
cursor.execute ("""
CREATE TABLE data_table(
value CHAR(80)
) ENGINE=MyISAM
""")
cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [0, 0]) #need to insert basecase
conn.commit()
def build_table(i,x): # i is index, x is data[i]
conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
cursor = conn.cursor ()
#print i,x
target_sum = 100
for s in range(target_sum + 1):
for c in range(target_sum + 1):
#print c, i
cursor.execute ("""
INSERT INTO data_table (value)
SELECT '%s'
FROM dual
WHERE ( SELECT COUNT(*) FROM data_table WHERE value='%s' )
= 1
AND NOT EXISTS
( SELECT * FROM data_table WHERE value='%s' )
""" % ([s, i+1], [s - c * x, i], [s, i+1]))
conn.commit()
conn.close()
data = [2,5,8]
pool = Pool(processes=4)
for i, x in enumerate(data):
build_table(i,x) #creates 250 records
#pool.apply_async(build_table, (i, x))
pool.close()
pool.join()
print 'completed'
It basically creates a table in mysql. The code above creates 250 entries(which is correct), but if you comment out build_table(i,x)
in the for loop and uncomment pool.apply_async(build_table, (i, x))
it creates only 52 records. Why is there a difference when multiprocessing the same function and is there anything I can do to fix it so the results are the same(I thought quickly committing updates would fix it but no luck)?
If I play around pool = Pool(processes=4)
and change it to 1, it works but I guess thats expected because its not multiprocessing really at that point. Also, if it helps I'm using InnoDB.
UPDATE: when I change to MyISAM I get 240 results being updated(not quite the 250 I need but much better than 52).
UPDATE2: mysql command was combined into a single command, and results seem to vary. Sometimes I get 248 results in the database, sometimes 240 or less. Maybe multiprocessing is causing this diverge between expected and actual results ?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我会尝试将 2 个 Selects 和 Insert 合并到一个 Insert 语句中:
变成类似:
不确定最后一行中的语法。您需要传递 3 个参数。
I would try to combine the 2 Selects and the Insert in one Insert statement:
Into something like:
Not sure about the syntax in the last line. You'll need to pass 3 parameters.