mysql 子查询性能
我的 SQL(带有子查询)花了很长时间(接近 24 小时)。使用子查询对性能不利吗?
我的表如下
mysql> show create table eventnew;
CREATE TABLE `eventnew` (
`id` int(50) NOT NULL AUTO_INCREMENT,
`date` datetime DEFAULT NULL,
`src_ip` int(10) unsigned DEFAULT NULL,
`src_port` int(10) unsigned DEFAULT NULL,
`dst_ip` int(10) unsigned DEFAULT NULL,
`dst_port` int(10) unsigned DEFAULT NULL,
`repo_ip` varchar(50) DEFAULT NULL,
`link` varchar(50) DEFAULT NULL,
`binary_hash` varchar(50) DEFAULT NULL,
`sensor_id` varchar(50) DEFAULT NULL,
`repox_ip` int(10) unsigned DEFAULT NULL,
`flags` varchar(50) DEFAULT NULL,
`shellcode` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`),
KEY `sensor_id` (`sensor_id`),
KEY `src_ip` (`src_ip`)
) ENGINE=MyISAM AUTO_INCREMENT=883278 DEFAULT CHARSET=latin1
我的 SQL 如下:
SELECT COUNT( DISTINCT binary_hash ) AS cnt
FROM eventnew
WHERE DATE >= '2010-10-16'
AND DATE < '2010-10-17'
AND binary_hash NOT
IN (
SELECT DISTINCT binary_hash
FROM eventnew
WHERE DATE < '2010-10-16'
AND binary_hash IS NOT NULL
)
下面是运行结果解释:
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| 1 | PRIMARY | eventnew | range | date | date | 9 | NULL | 14296 | Using where |
| 2 | DEPENDENT SUBQUERY | eventnew | range | date | date | 9 | NULL | 384974 | Using where |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
my SQL(with sub-queries) take so long(nearly 24hour). Is using sub-queries is not good for performance?
My table as below
mysql> show create table eventnew;
CREATE TABLE `eventnew` (
`id` int(50) NOT NULL AUTO_INCREMENT,
`date` datetime DEFAULT NULL,
`src_ip` int(10) unsigned DEFAULT NULL,
`src_port` int(10) unsigned DEFAULT NULL,
`dst_ip` int(10) unsigned DEFAULT NULL,
`dst_port` int(10) unsigned DEFAULT NULL,
`repo_ip` varchar(50) DEFAULT NULL,
`link` varchar(50) DEFAULT NULL,
`binary_hash` varchar(50) DEFAULT NULL,
`sensor_id` varchar(50) DEFAULT NULL,
`repox_ip` int(10) unsigned DEFAULT NULL,
`flags` varchar(50) DEFAULT NULL,
`shellcode` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`),
KEY `sensor_id` (`sensor_id`),
KEY `src_ip` (`src_ip`)
) ENGINE=MyISAM AUTO_INCREMENT=883278 DEFAULT CHARSET=latin1
my SQL as below:
SELECT COUNT( DISTINCT binary_hash ) AS cnt
FROM eventnew
WHERE DATE >= '2010-10-16'
AND DATE < '2010-10-17'
AND binary_hash NOT
IN (
SELECT DISTINCT binary_hash
FROM eventnew
WHERE DATE < '2010-10-16'
AND binary_hash IS NOT NULL
)
below are result running EXPLAIN:
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| 1 | PRIMARY | eventnew | range | date | date | 9 | NULL | 14296 | Using where |
| 2 | DEPENDENT SUBQUERY | eventnew | range | date | date | 9 | NULL | 384974 | Using where |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
使用子查询肯定会影响您的性能。例如,假设表 T1 有 'n' 条记录,T2 有 'm' 条记录。当您对 T1 和 T2 进行联接时,它将获取 n*m 条记录,然后根据您的条件对它们进行排序。同样的情况也适用于 in 关键字。如果子查询中有另一个约束,则会进一步降低效率。然而,在实践中使用子查询是不可避免的。
Using subqueries certainly does affect your performance. For instance, lets say a Table T1 has 'n' records and T2 has 'm' records. when you do a join on T1 and T2, it will take n*m records and then will sort them based on your condition. The same case goes with in keyword as well. and if you have another constraint in subquery, it would further decrease the efficiency. However, using subqueries couldn't be avoided in practice as they are meant to be.
我建议您使用
NOT EXISTS
而不是NOT IN
。I'd suggest you use
NOT EXISTS
instead ofNOT IN
.试试这个
Try this