这个 MySQL 查询可以优化吗?
我目前尝试优化一个 MySQL 查询,该查询在具有 10,000 多行的表上运行有点慢。
CREATE TABLE IF NOT EXISTS `person` (
`_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`_oid` char(8) NOT NULL,
`firstname` varchar(255) NOT NULL,
`lastname` varchar(255) NOT NULL,
PRIMARY KEY (`_id`),
KEY `_oid` (`_oid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `person_cars` (
`_id` int(11) NOT NULL AUTO_INCREMENT,
`_oid` char(8) NOT NULL,
`idx` varchar(255) NOT NULL,
`val` blob NOT NULL,
PRIMARY KEY (`_id`),
KEY `_oid` (`_oid`),
KEY `idx` (`idx`),
KEY `val` (`val`(64))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
# Insert some 10000+ rows…
INSERT INTO `person` (`_oid`,`firstname`,`lastname`)
VALUES
('1', 'John', 'Doe'),
('2', 'Jack', 'Black'),
('3', 'Jim', 'Kirk'),
('4', 'Forrest', 'Gump');
INSERT INTO `person_cars` (`_oid`,`idx`,`val`)
VALUES
('1', '0', 'BMW'),
('1', '1', 'PORSCHE'),
('2', '0', 'BMW'),
('3', '1', 'MERCEDES'),
('3', '0', 'TOYOTA'),
('3', '1', 'NISSAN'),
('4', '0', 'OLDMOBILE');
SELECT `_person`.`_oid`,
`_person`.`firstname`,
`_person`.`lastname`,
`_person_cars`.`cars[0]`,
`_person_cars`.`cars[1]`
FROM `person` `_person`
LEFT JOIN (
SELECT `_person`.`_oid`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]`
FROM `person` `_person`
JOIN `person_cars` `_person_cars` ON `_person`.`_oid` = `_person_cars`.`_oid`
GROUP BY `_person`.`_oid`
) `_person_cars` ON `_person_cars`.`_oid` = `_person`.`_oid`
WHERE `cars[0]` = 'BMW' OR `cars[1]` = 'BMW';
在我运行 MySQL 5.1.53 的虚拟机上,上述 SELECT 查询大约需要 170 毫秒。大约。两个表各有 10,000 行。
当我解释上面的查询时,结果会根据每个表中的行数而有所不同:
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | Using where |
| 1 | PRIMARY | _person | ALL | _oid | NULL | NULL | NULL | 4 | Using where; Using join buffer |
| 2 | DERIVED | _person_cars | ALL | _oid | NULL | NULL | NULL | 7 | Using temporary; Using filesort |
| 2 | DERIVED | _person | index | _oid | _oid | 24 | NULL | 4 | Using where; Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
大约 10,000 行给出以下结果:
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 6613 | Using where |
| 1 | PRIMARY | _person | ref | _oid | _oid | 24 | _person_cars._oid | 10 | |
| 2 | DERIVED | _person_cars | ALL | _oid | NULL | NULL | NULL | 9913 | Using temporary; Using filesort |
| 2 | DERIVED | _person | ref | _oid | _oid | 24 | test._person_cars._oid | 10 | Using index |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
当我省略 WHERE 子句或当我 LEFT JOIN 另一个类似于 person_cars 的表时,情况会变得更糟。
有谁知道如何优化 SELECT 查询以使事情变得更快一点?
I currently try to optimize a MySQL query which runs a little slow on tables with 10,000+ rows.
CREATE TABLE IF NOT EXISTS `person` (
`_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`_oid` char(8) NOT NULL,
`firstname` varchar(255) NOT NULL,
`lastname` varchar(255) NOT NULL,
PRIMARY KEY (`_id`),
KEY `_oid` (`_oid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `person_cars` (
`_id` int(11) NOT NULL AUTO_INCREMENT,
`_oid` char(8) NOT NULL,
`idx` varchar(255) NOT NULL,
`val` blob NOT NULL,
PRIMARY KEY (`_id`),
KEY `_oid` (`_oid`),
KEY `idx` (`idx`),
KEY `val` (`val`(64))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
# Insert some 10000+ rows…
INSERT INTO `person` (`_oid`,`firstname`,`lastname`)
VALUES
('1', 'John', 'Doe'),
('2', 'Jack', 'Black'),
('3', 'Jim', 'Kirk'),
('4', 'Forrest', 'Gump');
INSERT INTO `person_cars` (`_oid`,`idx`,`val`)
VALUES
('1', '0', 'BMW'),
('1', '1', 'PORSCHE'),
('2', '0', 'BMW'),
('3', '1', 'MERCEDES'),
('3', '0', 'TOYOTA'),
('3', '1', 'NISSAN'),
('4', '0', 'OLDMOBILE');
SELECT `_person`.`_oid`,
`_person`.`firstname`,
`_person`.`lastname`,
`_person_cars`.`cars[0]`,
`_person_cars`.`cars[1]`
FROM `person` `_person`
LEFT JOIN (
SELECT `_person`.`_oid`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]`
FROM `person` `_person`
JOIN `person_cars` `_person_cars` ON `_person`.`_oid` = `_person_cars`.`_oid`
GROUP BY `_person`.`_oid`
) `_person_cars` ON `_person_cars`.`_oid` = `_person`.`_oid`
WHERE `cars[0]` = 'BMW' OR `cars[1]` = 'BMW';
The above SELECT query takes ~170ms on my virtual machine running MySQL 5.1.53. with approx. 10,000 rows in each of the two tables.
When I EXPLAIN the above query, results differ depending on how many rows are in each table:
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | Using where |
| 1 | PRIMARY | _person | ALL | _oid | NULL | NULL | NULL | 4 | Using where; Using join buffer |
| 2 | DERIVED | _person_cars | ALL | _oid | NULL | NULL | NULL | 7 | Using temporary; Using filesort |
| 2 | DERIVED | _person | index | _oid | _oid | 24 | NULL | 4 | Using where; Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
Some 10,000 rows give the following result:
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 6613 | Using where |
| 1 | PRIMARY | _person | ref | _oid | _oid | 24 | _person_cars._oid | 10 | |
| 2 | DERIVED | _person_cars | ALL | _oid | NULL | NULL | NULL | 9913 | Using temporary; Using filesort |
| 2 | DERIVED | _person | ref | _oid | _oid | 24 | test._person_cars._oid | 10 | Using index |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
Things get worse when I leave out the WHERE clause or when I LEFT JOIN another table similar to person_cars
.
Does anyone have an idea how to optimize the SELECT query to make things a little faster?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
它很慢,因为这将强制对人员进行三个全表扫描,然后将它们连接在一起:
考虑到 where 子句,左连接实际上是一种内部连接。您可以在与人员的连接实际发生之前添加条件。该连接也不必要地应用了两次。
这将使速度更快,但如果您有一个 order by/limit 子句,由于子查询中的 group by ,它仍然会导致对 person 进行全表扫描(即仍然不好):
如果您应用 order by/限制,您将通过两个查询获得更好的结果,即:
然后使用生成的 id 通过 IN () 子句选择汽车。
哦,您的
vals
列可能应该是 varchar。It's slow because this will force three full table scans on persons that then get joined together:
Considering the where clauses the left join is really an inner join, for one. And you could shove the conditions before the join with persons actually occurs. That join is also needlessly applied twice.
This will make it faster, but if you've an order by/limit clause it will still lead to a full table scan on persons (i.e. still not good) because of the group by in the subquery:
If you apply an order by/limit, you'll get better results with two queries, i.e.:
And then select the cars with an IN () clause using the resulting ids.
Oh, and your
vals
column probably should be a varchar.检查这个
Check This