Python 的数据库查询速度较慢,但 Perl 则不然
我的网上商店使用 python (Django)。
当我测试高负载(数据库访问)时得到了有趣的结果:
python 10 process = 200sec / 100% CPU utilisation
perl 10 process = 65sec / 35% CPU utilisation
Centos 6、python 2.6、mysql 5.5、标准库、其他服务器上的 mysql-server。 表product_cars 有 70 000 000 条记录。
为什么 python 程序这么慢?
Python 程序:
#!/usr/bin/python
import MySQLdb
import re
from MySQLdb import cursors
import shutil
import datetime
import random
db0 = MySQLdb.connect(user="X", passwd="X", db="parts")
cursor0 = db0.cursor()
cursor0.execute('SET NAMES utf8')
now = datetime.datetime.now()
for x in xrange(1, 100000):
id = random.randint(10, 50000)
cursor0.execute("SELECT * FROM product_cars WHERE car_id=%s LIMIT 500", [id])
cursor0.fetchone()
Perl 程序:
#!/usr/bin/perl
use DBI;
my $INSTANCE=$ARGV[0];
my $user = "x";
my $pw = "x";
my $db = DBI->connect( "dbi:mysql:parts", "x", "x");
my $sql= "SELECT * FROM product_cars WHERE car_id=? LIMIT 500";
foreach $_ ( 1 .. 100000 )
{
$random = int(rand(50000));
$cursor = $db->prepare($sql);
$cursor->execute($random) || die $cursor->errstr;
@Data= $cursor->fetchrow_array();
}
$cursor->finish;
$db->disconnect;
update1
有趣的事情:
总是选择 id=1 的行:< /strong>
清楚 MYSQL 使用缓存和查询会非常快,但同样很慢并且 CPU 利用率为 100%。但相同的 Perl 或 ruby 代码运行速度很快。
如果替换 python 代码中的字符串:
# remove "SET NAMES utf8" string - this has no impact
# python-mysql use "%s", but not "?" as parameter marker
id = 1
for x in xrange(1, 100000):
id = 1
cursor0.execute("SELECT * FROM product_cars WHERE car_id=%s LIMIT 500", [id])
cursor0.fetchone()
perl 中的相同代码:
foreach $_ ( 1 .. 20000 )
{
$cursor = $db->prepare( "SELECT * FROM product_cars WHERE car_id=? LIMIT 500";);
$cursor->execute(1);
# while (my @Data= $cursor->fetchrow_array())
if ($_ % 1000 == 0) { print "$_\n" };.
@Data= $cursor->fetchrow_array();
# print "$_\n";
}
ruby 中的代码:
pk=2
20000.times do |i|
if i % 1000 == 0
print i, "\n"
end
res = my.query("SELECT * FROM product_cars WHERE car_id='#{pk}' LIMIT 500")
res.fetch_row
end
更新 2
Exec SQL "SELECT * FROM product WHERE id=1" (string without params) 100000 times
Python: ~15 sec 100% CPU 100%
Perl: ~9 sec CPU 70-90%
Ruby: ~6 sec CPU 60-80%
其他计算机上的 MySQL 服务器。
更新 3
尝试使用 oursql 和 pymysql - 结果更差。
I use python (Django) for my web-shop.
When I tested high loading (db access) got interesting results:
python 10 process = 200sec / 100% CPU utilisation
perl 10 process = 65sec / 35% CPU utilisation
Centos 6, python 2.6, mysql 5.5, standard libraries, mysql-server on other server.
Table product_cars have 70 000 000 records.
Why python-program so slow?
Python program:
#!/usr/bin/python
import MySQLdb
import re
from MySQLdb import cursors
import shutil
import datetime
import random
db0 = MySQLdb.connect(user="X", passwd="X", db="parts")
cursor0 = db0.cursor()
cursor0.execute('SET NAMES utf8')
now = datetime.datetime.now()
for x in xrange(1, 100000):
id = random.randint(10, 50000)
cursor0.execute("SELECT * FROM product_cars WHERE car_id=%s LIMIT 500", [id])
cursor0.fetchone()
Perl program:
#!/usr/bin/perl
use DBI;
my $INSTANCE=$ARGV[0];
my $user = "x";
my $pw = "x";
my $db = DBI->connect( "dbi:mysql:parts", "x", "x");
my $sql= "SELECT * FROM product_cars WHERE car_id=? LIMIT 500";
foreach $_ ( 1 .. 100000 )
{
$random = int(rand(50000));
$cursor = $db->prepare($sql);
$cursor->execute($random) || die $cursor->errstr;
@Data= $cursor->fetchrow_array();
}
$cursor->finish;
$db->disconnect;
update1
Interesting thing:
select always row with id=1:
Сlear that MYSQL use cache and query will be very fast, but again slow and 100% CPU utilisation. But same perl or ruby code work quick.
if replace string in python code:
# remove "SET NAMES utf8" string - this has no impact
# python-mysql use "%s", but not "?" as parameter marker
id = 1
for x in xrange(1, 100000):
id = 1
cursor0.execute("SELECT * FROM product_cars WHERE car_id=%s LIMIT 500", [id])
cursor0.fetchone()
Same code in perl:
foreach $_ ( 1 .. 20000 )
{
$cursor = $db->prepare( "SELECT * FROM product_cars WHERE car_id=? LIMIT 500";);
$cursor->execute(1);
# while (my @Data= $cursor->fetchrow_array())
if ($_ % 1000 == 0) { print "$_\n" };.
@Data= $cursor->fetchrow_array();
# print "$_\n";
}
Code in ruby:
pk=2
20000.times do |i|
if i % 1000 == 0
print i, "\n"
end
res = my.query("SELECT * FROM product_cars WHERE car_id='#{pk}' LIMIT 500")
res.fetch_row
end
update 2
Exec SQL "SELECT * FROM product WHERE id=1" (string without params) 100000 times
Python: ~15 sec 100% CPU 100%
Perl: ~9 sec CPU 70-90%
Ruby: ~6 sec CPU 60-80%
MySQL-server on other machine.
update 3
Tried use oursql and pymysql - worse results.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
正如人们所指出的,两者之间准备和执行语句的方式不同,并且不是推荐的做法。两者都应该利用准备好的语句,并且都应该在循环之外进行准备。
但是,Python MySQL 驱动程序似乎没有利用 服务器端完全准备好了声明。这可能是性能不佳的原因。
MySQL 4.1 中添加了服务器端预准备语句,但某些驱动程序适应起来非常缓慢。 MySQLdb 用户指南 没有提及准备好的语句,并认为“MySQL 中没有游标,也没有参数替换” “自 MySQL 4.1 以来,情况并非如此。它还说“MySQLdb 的 Connection 和 Cursor 对象是用 Python 编写的”,而不是利用 MySQL API。
您可能需要查看 oursql 驱动程序。看起来它是为了利用“新”MySQL API 并让数据库自我优化而编写的。
DBD::mysql(Perl MySQL 驱动程序)可以利用准备好的语句,但它根据文档,默认情况下不会。您必须通过将
mysql_server_prepare=1
添加到您的 dsn 来打开它。这应该会使 Perl 示例运行得更快。或者文档是撒谎的并且默认情况下它们是打开的。顺便说一句,虽然没有考虑到 2 分钟的差异,但会偏离基准的一件事是生成随机数。他们的成本很高。
Python 代码
Perl 代码
Python 时间
Perl 时间
为了防止这成为更敏感的基准测试中的问题,请改为增加计数器。
As people have pointed out, the way you're preparing and executing statements between the two is not the same and is not the recommended practice. Both should be taking advantage of prepared statements, and both should be preparing outside the loop.
However, it looks like that Python MySQL driver does not take advantage of server side prepared statements at all. This probably accounts for the poor performance.
Server side prepared statements were added in MySQL 4.1, but some drivers have been very slow to adapt. The MySQLdb users guide makes no mention of prepared statements and thinks "there are no cursors in MySQL, and no parameter substitution" which hasn't been true since MySQL 4.1. It also says "MySQLdb's Connection and Cursor objects are written in Python" rather than taking advantage of the MySQL API.
You may want to look at the oursql driver. It looks like it was written to take advantage of the "new" MySQL API and let the database optimize itself.
DBD::mysql (the Perl MySQL driver) can take advantage of prepared statements, but it does not by default according to the documentation. You have to turn it on by adding
mysql_server_prepare=1
to your dsn. That should make the Perl example run even faster. Or the documentation is lying and they're on by default.As an aside, one thing that will throw off benchmarks, though not account for anything like 2 minutes difference, is generating random numbers. They have significant cost.
Python code
Perl code
Python time
Perl time
To keep this from becoming an issue in more sensitive benchmarks, increment a counter instead.
理论上,如果您在循环之前执行
$cursor = $db->prepare($sql);
并简单地重复执行相同的准备好的查询,您的 Perl 代码应该会显着加速。我怀疑 DBI 或 MySQL 只是缓存并忽略了您重复的相同查询准备工作。另一方面,您的 Python 代码要求每次重新编译不同的查询,因为您没有使用准备好的查询。如果您在循环之前正确准备两个查询,我希望速度差异会消失。顺便说一下,使用准备好的查询也有安全上的好处。
In theory, your Perl code should speed up significantly if you execute
$cursor = $db->prepare($sql);
before the loop and simply reexecute the same prepared query repeatedly. I suspect either DBI or MySQL has simply cached and ignored your repeated identical query preparations.Your Python code, on the other hand, demands that different queries be recompiled each time because you aren't using a prepared query. I'd expect the speed difference to evaporate if you prepare both queries properly before their loop. There are security benefits for to using prepared queries as well, by the way.