使用 python 多处理时,mysql 得到不同的结果

发布于 2025-01-06 17:38:47 字数 2054 浏览 0 评论 0原文

我不知道我做错了什么(或者如何纠正它)。显示一些代码可能会更容易(它比我正在做的事情有点简化,但它证明了我的观点):

from multiprocessing import Pool
import MySQLdb
import sys

#sql connection
try:
    conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
except MySQLdb.Error, e:
     print "Error %d: %s" % (e.args[0], e.args[1])
     sys.exit (1)

#with database
cursor = conn.cursor ()
cursor.execute ("DROP TABLE IF EXISTS data_table")
cursor.execute ("""
    CREATE TABLE data_table(
        value     CHAR(80)
    ) ENGINE=MyISAM 
""")

cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [0, 0]) #need to insert basecase
conn.commit()


def build_table(i,x):    # i is index, x is data[i]
    conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
    cursor = conn.cursor ()
    #print i,x
    target_sum = 100
    for s in range(target_sum + 1): 
        for c in range(target_sum + 1): 
            #print c, i
            cursor.execute ("""
INSERT INTO data_table (value) 
   SELECT  '%s'
   FROM dual
   WHERE ( SELECT COUNT(*) FROM data_table WHERE value='%s' )
         = 1
     AND NOT EXISTS
         ( SELECT * FROM data_table WHERE value='%s' )
           """ % ([s, i+1], [s - c * x, i], [s, i+1]))
            conn.commit()

    conn.close()

data = [2,5,8]
pool = Pool(processes=4)
for i, x in enumerate(data): 
    build_table(i,x) #creates 250 records
    #pool.apply_async(build_table, (i, x))
pool.close()
pool.join()

print 'completed'

它基本上在 mysql 中创建了一个表。上面的代码创建了 250 个条目(这是正确的),但是如果您在 for 循环中注释掉 build_table(i,x) 并取消注释 pool.apply_async(build_table, (i, x) ) 它只创建了 52 条记录。为什么在多重处理同一函数时会出现差异,我可以做些什么来修复它,以便结果是相同的(我认为快速提交更新可以修复它,但没有运气)?

如果我玩 pool = Pool(processes=4) 并将其更改为 1,它会起作用,但我想这是预期的,因为此时它实际上并不是多处理。另外,如果有帮助的话我正在使用 InnoDB。

更新:当我更改为 MyISAM 时,我得到了 240 个正在更新的结果(不完全是我需要的 250 个,但比 52 个好得多)。

UPDATE2:mysql 命令被合并为单个命令,结果似乎有所不同。有时我会在数据库中得到 248 个结果,有时会得到 240 个或更少。也许多重处理导致了预期结果和实际结果之间的这种差异?

I can't figure out what I'm doing wrong(or how to correct it). It might be easier to show some code(its a bit simplified from what I'm doing but it proves my point):

from multiprocessing import Pool
import MySQLdb
import sys

#sql connection
try:
    conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
except MySQLdb.Error, e:
     print "Error %d: %s" % (e.args[0], e.args[1])
     sys.exit (1)

#with database
cursor = conn.cursor ()
cursor.execute ("DROP TABLE IF EXISTS data_table")
cursor.execute ("""
    CREATE TABLE data_table(
        value     CHAR(80)
    ) ENGINE=MyISAM 
""")

cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [0, 0]) #need to insert basecase
conn.commit()


def build_table(i,x):    # i is index, x is data[i]
    conn = MySQLdb.connect (host = "127.0.0.1",user = "user",passwd = "password", db = "mydb")
    cursor = conn.cursor ()
    #print i,x
    target_sum = 100
    for s in range(target_sum + 1): 
        for c in range(target_sum + 1): 
            #print c, i
            cursor.execute ("""
INSERT INTO data_table (value) 
   SELECT  '%s'
   FROM dual
   WHERE ( SELECT COUNT(*) FROM data_table WHERE value='%s' )
         = 1
     AND NOT EXISTS
         ( SELECT * FROM data_table WHERE value='%s' )
           """ % ([s, i+1], [s - c * x, i], [s, i+1]))
            conn.commit()

    conn.close()

data = [2,5,8]
pool = Pool(processes=4)
for i, x in enumerate(data): 
    build_table(i,x) #creates 250 records
    #pool.apply_async(build_table, (i, x))
pool.close()
pool.join()

print 'completed'

It basically creates a table in mysql. The code above creates 250 entries(which is correct), but if you comment out build_table(i,x) in the for loop and uncomment pool.apply_async(build_table, (i, x)) it creates only 52 records. Why is there a difference when multiprocessing the same function and is there anything I can do to fix it so the results are the same(I thought quickly committing updates would fix it but no luck)?

If I play around pool = Pool(processes=4) and change it to 1, it works but I guess thats expected because its not multiprocessing really at that point. Also, if it helps I'm using InnoDB.

UPDATE: when I change to MyISAM I get 240 results being updated(not quite the 250 I need but much better than 52).

UPDATE2: mysql command was combined into a single command, and results seem to vary. Sometimes I get 248 results in the database, sometimes 240 or less. Maybe multiprocessing is causing this diverge between expected and actual results ?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

手长情犹 2025-01-13 17:38:47

我会尝试将 2 个 Selects 和 Insert 合并到一个 Insert 语句中:

#print c, i
cursor.execute(""" SELECT value FROM data_table WHERE value='%s' """ % ([s - c * x, i]))
if cursor.rowcount == 1:
    cursor.execute(""" SELECT value FROM data_table WHERE value='%s' """ % [s, i+1])
    if cursor.rowcount == 0:
        cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [s, i+1])

变成类似:

#print c, i
cursor.execute ("""
    INSERT INTO data_table (value) 
       SELECT  '%s'
       FROM dual
       WHERE ( SELECT COUNT(*) FROM data_table WHERE value='%s' )
             = 1
         AND NOT EXISTS
             ( SELECT * FROM data_table WHERE value='%s' )
               """ % ([s, i+1], [s - c * x, i], [s, i+1]))

不确定最后一行中的语法。您需要传递 3 个参数。

I would try to combine the 2 Selects and the Insert in one Insert statement:

#print c, i
cursor.execute(""" SELECT value FROM data_table WHERE value='%s' """ % ([s - c * x, i]))
if cursor.rowcount == 1:
    cursor.execute(""" SELECT value FROM data_table WHERE value='%s' """ % [s, i+1])
    if cursor.rowcount == 0:
        cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [s, i+1])

Into something like:

#print c, i
cursor.execute ("""
    INSERT INTO data_table (value) 
       SELECT  '%s'
       FROM dual
       WHERE ( SELECT COUNT(*) FROM data_table WHERE value='%s' )
             = 1
         AND NOT EXISTS
             ( SELECT * FROM data_table WHERE value='%s' )
               """ % ([s, i+1], [s - c * x, i], [s, i+1]))

Not sure about the syntax in the last line. You'll need to pass 3 parameters.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文