重写这个子查询?

发布于 2024-09-26 17:37:23 字数 2424 浏览 1 评论 0原文

我正在尝试构建一个新表,以便现有表中的值不包含在另一个表中(但显然以下检查包含)。以下是我的表结构:

mysql> explain t1;
+-----------+---------------------+------+-----+---------+-------+
| Field     | Type                | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+---------+-------+
| id        | int(11)             | YES  |     | NULL    |       | 
| point     | bigint(20) unsigned | NO   | MUL | 0       |       | 
+-----------+---------------------+------+-----+---------+-------+

mysql> explain whitelist;
+-------------+---------------------+------+-----+---------+----------------+
| Field       | Type                | Null | Key | Default | Extra          |
+-------------+---------------------+------+-----+---------+----------------+
| id          | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment | 
| x           | bigint(20) unsigned | YES  |     | NULL    |                | 
| y           | bigint(20) unsigned | YES  |     | NULL    |                | 
| geonetwork  | linestring          | NO   | MUL | NULL    |                | 
+-------------+---------------------+------+-----+---------+----------------+

我的查询如下所示:

SELECT point 
  FROM t1 
 WHERE EXISTS(SELECT source 
                FROM whitelist 
               WHERE MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))));

解释:

    +----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
| id | select_type        | table              | type  | possible_keys     | key       | key_len | ref  | rows | Extra                    |
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
|  1 | PRIMARY            | t1                 | index | NULL              | point     | 8       | NULL | 1001 | Using where; Using index | 
|  2 | DEPENDENT SUBQUERY | whitelist          | ALL   | _geonetwork       | NULL      | NULL    | NULL | 3257 | Using where              | 
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+

对于 t1 中的 1000 条记录,查询需要 6 秒执行,这对我来说是不可接受的。如果我没有要连接的列,如何使用连接(或者可能是更快的方法,如果存在)重写此查询?我想在最坏的情况下,即使是存储过程也是可以接受的。我的目标是最终创建一个包含 t1 中的条目的新表。有什么建议吗?

I am trying to build a new table such that the values in the existing table are NOT contained (but obviously the following checks for contained) in another table. Following is my table structure:

mysql> explain t1;
+-----------+---------------------+------+-----+---------+-------+
| Field     | Type                | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+---------+-------+
| id        | int(11)             | YES  |     | NULL    |       | 
| point     | bigint(20) unsigned | NO   | MUL | 0       |       | 
+-----------+---------------------+------+-----+---------+-------+

mysql> explain whitelist;
+-------------+---------------------+------+-----+---------+----------------+
| Field       | Type                | Null | Key | Default | Extra          |
+-------------+---------------------+------+-----+---------+----------------+
| id          | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment | 
| x           | bigint(20) unsigned | YES  |     | NULL    |                | 
| y           | bigint(20) unsigned | YES  |     | NULL    |                | 
| geonetwork  | linestring          | NO   | MUL | NULL    |                | 
+-------------+---------------------+------+-----+---------+----------------+

My query looks like this:

SELECT point 
  FROM t1 
 WHERE EXISTS(SELECT source 
                FROM whitelist 
               WHERE MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))));

Explain:

    +----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
| id | select_type        | table              | type  | possible_keys     | key       | key_len | ref  | rows | Extra                    |
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
|  1 | PRIMARY            | t1                 | index | NULL              | point     | 8       | NULL | 1001 | Using where; Using index | 
|  2 | DEPENDENT SUBQUERY | whitelist          | ALL   | _geonetwork       | NULL      | NULL    | NULL | 3257 | Using where              | 
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+

The query is taking 6 seconds to execute for 1000 records in t1 which is unacceptable for me. How can I rewrite this query using Joins (or perhaps a faster way if that exists) if I don't have a column to join on? Even a stored procedure is acceptable I guess in the worst case. My goal is to finally create a new table containing entries from t1. Any suggestions?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

月亮邮递员 2024-10-03 17:37:23

除非查询优化器失败,否则 WHERE EXISTS 构造应产生与使用 GROUP 子句的联接相同的计划。看看优化 MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)')))),这可能是您的查询花费所有时间的地方。我对此没有建议,但这是用 JOIN 编写的查询:

Select t1.point
from t1
join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))))
group by t1.point
;

或者获取 t1 中不在白名单中的点:

Select t1.point
from t1
left join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))))
where whitelist.id is null
;

Unless the query optimizer is failing, a WHERE EXISTS construct should result in the same plan as a join with a GROUP clause. Look at optimizing MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)')))), that's probably where your query is spending all its time. I don't have a suggestion for that, but here's your query written with a JOIN:

Select t1.point
from t1
join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))))
group by t1.point
;

or to get the points in t1 not in whitelist:

Select t1.point
from t1
left join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))))
where whitelist.id is null
;

故乡的云 2024-10-03 17:37:23

这似乎是一种反规范化 t1 可能有益的情况。添加值为 GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))GeomFrmTxt 列可以加快您已有的查询速度。

This seems like a case where de-nomalizing t1 might be beneficial. Adding a GeomFrmTxt column with a value of GeomFromText(CONCAT('POINT(', t1.point, ' 0)')) could speed up the query you already have.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文