A Python loop that opens Excel files and writes to MySQL starts fast, then gets slower and slower. How can I fix it?

Posted 2022-09-13 01:03:02, 3216 characters, 28 views, 0 comments

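I need to read the contents of 30 spreadsheets and write them to a database. I open the spreadsheets in a for loop; run that way, it starts out fast and then gets slower and slower.
Run on a single file it is fast: reading one spreadsheet and writing it to the database takes about 200 s. What causes this, and how can I fix it?
The d column in MySQL is indexed.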

import pymysql
from openpyxl import load_workbook
def read_excel(p):
    # p = '26'
    db = pymysql.Connect(host="localhost",port=3306,user="test",passwd="123456",db="test",charset="utf8")    
    cur = db.cursor()                
    excel = r'E:\整理数据\4月\1 ({}).xlsx'.format(p)
    wb = load_workbook(excel)
    ws = wb.active
    rows = ws.max_row
    #print(rows)
    # max_row is the number of the last row, so add 1 to include it in the range
    for i in range(2, rows + 1):
        d = ws['G%s' %i].value
        fb = ws['F%s' %i].value
        kh = ws['BB%s' %i].value
        wdzl = ws['AV%s' %i].value
        zxzl = ws['AW%s' %i].value
        jpzl = ws['AU%s' %i].value
        jszl = ws['AT%s' %i].value
        jpinfo = ws['AX%s' %i].value
        ywtime = ws['J%s' %i].value
        try:
            # Parameterized queries let the driver handle quoting and escaping
            cur.execute("select dh from testdata where d = %s", (d,))
            tid = cur.fetchone()
            if tid is None:
                print('new row')
                # Insert the new row into the database
                input_sql = "insert into testdata(d,fb,kh,wdzl,zxzl,jpzl,jszl,jpinfo,ywtime) values(%s,%s,%s,%s,%s,%s,%s,%s,%s)"
                cur.execute(input_sql, (d,fb,kh,wdzl,zxzl,jpzl,jszl,jpinfo,ywtime))
                db.commit()
                print(i,d,fb,wdzl,zxzl,jpzl,jszl,jpinfo,ywtime)
            else:
                print('row already exists')
                sql1 = "select wdzl,zxzl,jszl,jpzl,jpinfo from testdata where d = %s"
                cur.execute(sql1, (d,))
                w = cur.fetchone()
                print(w)

                # Overwrite a column only when the stored value is '0' and the new one is not.
                # The pairs are listed in the same order as the columns selected into w.
                new_values = (('wdzl', wdzl), ('zxzl', zxzl), ('jszl', jszl), ('jpzl', jpzl), ('jpinfo', jpinfo))
                for (col, new), old in zip(new_values, w):
                    if new != '0' and old == '0':
                        # col comes from the fixed tuple above, so formatting it into the SQL is safe
                        up_sql = "update testdata set {} = %s where d = %s".format(col)
                        cur.execute(up_sql, (new, d))
                        db.commit()
            print(d)
        except Exception as e:
            print(e,d)
    print('-'*50)
    print('file', p, 'done')
    cur.close()
    db.close()
 
# for i in range(4,31):
#     read_excel(i)
read_excel('17')
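The loop above calls `db.commit()` after every single insert and update. One thing that may be worth measuring is whether batching the writes changes the timing. The sketch below covers only the insert path (no existence check) and is not a drop-in replacement for `read_excel`. The helper name `insert_rows` and its `placeholder` parameter are hypothetical; `executemany` and `commit` are standard DB-API calls that pymysql provides.

```python
# Hypothetical helper: insert many rows with one executemany call and one commit.
COLUMNS = ("d", "fb", "kh", "wdzl", "zxzl", "jpzl", "jszl", "jpinfo", "ywtime")


def insert_rows(conn, rows, placeholder="%s"):
    """Insert a list of 9-tuples into testdata and commit once.

    conn is any DB-API connection (for example the one returned by
    pymysql.Connect). placeholder is the driver's parameter marker;
    pymysql uses "%s".
    """
    sql = "insert into testdata({}) values({})".format(
        ", ".join(COLUMNS),
        ", ".join([placeholder] * len(COLUMNS)),
    )
    cur = conn.cursor()
    try:
        cur.executemany(sql, rows)
        conn.commit()  # one commit for the whole batch
    finally:
        cur.close()
    return len(rows)
```

Collecting the new rows of one spreadsheet in a list and passing them to `insert_rows(db, new_rows)` would replace one commit per row with one commit per file. Whether that accounts for the slowdown described here is an assumption to verify by timing.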


回心转意 2022-09-20 01:03:02

Drop the print calls. The slowdown may be caused by print.
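A minimal sketch of one way to test this suggestion: report progress only every N rows instead of printing every row, and include the elapsed time so a slowdown becomes visible. The names `process_rows`, `handle_row`, `report_every`, and `report` are hypothetical and do not come from the question.

```python
import time


def process_rows(rows, handle_row, report_every=1000, report=print):
    """Call handle_row on each row, reporting progress only occasionally.

    Returns the number of rows processed.
    """
    start = time.perf_counter()
    count = 0
    for count, row in enumerate(rows, start=1):
        handle_row(row)
        # Emit one progress line per report_every rows instead of one per row
        if count % report_every == 0:
            elapsed = time.perf_counter() - start
            report(f"{count} rows, {elapsed:.1f}s elapsed")
    return count
```

If the elapsed time per thousand rows still climbs with printing reduced this way, print is probably not the main cause.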


一直在等你来 2022-09-20 01:03:02

openpyxl does not seem to release memory on its own. Try releasing it manually after each file has been read:

import gc

...

del wb,ws
gc.collect()
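Whether memory really grows from file to file can be checked before relying on this. The sketch below uses the standard-library `tracemalloc` module to record traced memory after each file; the function name `run_with_memory_report` and its parameters are hypothetical.

```python
import gc
import tracemalloc


def run_with_memory_report(items, process, report=print):
    """Run process(item) for each item and report traced memory afterwards."""
    tracemalloc.start()
    try:
        for item in items:
            process(item)
            gc.collect()  # collect unreachable objects before measuring
            current, peak = tracemalloc.get_traced_memory()
            # current and peak are byte counts of Python allocations being traced
            report(item, current, peak)
    finally:
        tracemalloc.stop()
```

With the function from the question this could be called as `run_with_memory_report(range(4, 31), read_excel)`. A `current` value that keeps rising across files would support the memory explanation; a flat one would point elsewhere.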


呆 2022-09-20 01:03:02

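My guess is that when openpyxl reads a cell by row and column number, it scans from the first row and first column (or perhaps from the first row or column that holds data) until it reaches the requested row and column, so the further into the sheet you read, the slower each lookup gets. xlrd, by contrast, behaves like an array: asking for the n-th element goes straight to that element's address in memory by index, so the speed stays roughly constant however large the Excel file is.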
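Whatever the real cost of a lookup such as `ws['G%s' % i]` is, it can be sidestepped by reading each row once, in order. The sketch below uses openpyxl's read-only mode with `iter_rows(values_only=True)`. The column letters are the ones from the question; the names `iter_records` and `COLUMNS` are hypothetical.

```python
from openpyxl import load_workbook
from openpyxl.utils import column_index_from_string

# Field name -> column letter, taken from the cell lookups in the question
COLUMNS = {
    "d": "G", "fb": "F", "kh": "BB", "wdzl": "AV", "zxzl": "AW",
    "jpzl": "AU", "jszl": "AT", "jpinfo": "AX", "ywtime": "J",
}


def iter_records(path):
    """Yield one dict per data row, reading the sheet in a single pass."""
    # Convert column letters to 0-based positions in the row tuple
    index = {name: column_index_from_string(col) - 1 for name, col in COLUMNS.items()}
    wb = load_workbook(path, read_only=True)
    try:
        ws = wb.active
        # min_row=2 skips the header row, as the loop in the question does
        for row in ws.iter_rows(min_row=2, values_only=True):
            yield {
                name: (row[i] if i < len(row) else None)
                for name, i in index.items()
            }
    finally:
        wb.close()  # read-only workbooks keep the file open until closed
```

Each record can then be handled with `for rec in iter_records(excel): ...`, using `rec["d"]`, `rec["fb"]`, and so on in place of the per-cell lookups.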

The suggestion in the answer above is also reasonable and worth trying.
